Title

Collectin-complement activating protein chimeras.

Field of invention

The present invention relates to a fusion protein capable of activating the complement
system, methods for producing said fusion protein, as well as a pharmaceutical
composition comprising said fusion protein and methods for treating diseases, in
particular infections, with said fusion protein.

Background of invention 



Animals have developed different complex strategies to protect themselves against
infections. The immune responses can be divided into two main groups: the adaptive
immune response, in which an adaptation has taken place and in which cells play a
dominant part, and the innate immune response, which is available instantly and
which is primarily based on molecules present in the body fluids. The innate immune
system is operational at the time of birth, in contrast to the adaptive immune defence,
which only obtains its full power of protecting the body during infancy (Janeway et
al., 1999).

Bacteria entering the body at mucosal surfaces or through broken skin are immediately
recognised by collectins, a family of soluble proteins that recognise distinctive
carbohydrate configurations that are present on the surfaces of microbes and absent
from the cells of the multicellular organism. Collectins thus belong to the large
and diverse group of pattern recognition receptors of the innate immune system. In
humans, three collectins are known, although others may exist; cows, for example,
have more. Collectins target the particles to which they bind either for uptake by
phagocytes or for activation of the complement cascade, and in these ways can
mediate their destruction.

Collectins all exhibit the following architecture: they have an N-terminal cysteine-rich
region that appears to form inter-chain disulfide bonds, followed by a collagen-like
region, an α-helical coiled-coil region and finally a C-type lectin domain, which is the
pattern-recognising region and is referred to as the carbohydrate recognition domain
(CRD). The name collectin is derived from the presence of both collagen and lectin
domains. The α-helical coiled-coil region initiates trimerisation of the individual
polypeptides to form collagen triple coils, thereby generating collectin subunits each
consisting of 3 individual polypeptides, whereas the N-terminal region mediates
formation of oligomers of subunits. Different collectins exhibit distinctive higher order
structures, typically either tetramers of subunits or hexamers of subunits. The
grouping of large numbers of binding domains allows collectins to bind with high
avidity to microbial cell walls, despite a relatively low intrinsic affinity of each
individual CRD for carbohydrates.
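
As a rough arithmetic illustration of the oligomer structure described above, the number of CRDs presented by one collectin oligomer is three per subunit multiplied by the number of subunits. The sketch below is not part of the specification; the function name `crd_count` is hypothetical, and it assumes that every polypeptide carries exactly one CRD and that all subunits are complete trimers.

```python
# Illustrative sketch only: counts carbohydrate recognition domains (CRDs)
# per oligomer, assuming each subunit is a trimer of polypeptides and each
# polypeptide contributes exactly one CRD.
POLYPEPTIDES_PER_SUBUNIT = 3


def crd_count(subunits: int) -> int:
    """Return the number of CRDs in an oligomer built from `subunits` trimeric subunits."""
    if subunits < 1:
        raise ValueError("an oligomer needs at least one subunit")
    return POLYPEPTIDES_PER_SUBUNIT * subunits


if __name__ == "__main__":
    for name, n in (("tetramer", 4), ("hexamer", 6)):
        print(f"{name} of subunits: {crd_count(n)} CRDs")
```

Under these assumptions a tetramer of subunits presents 12 CRDs and a hexamer 18, which is the multivalency that underlies the high binding avidity.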

C-type CRDs are found in proteins with a widespread occurrence, both from a
phylogenetic and a functional perspective. The different CRDs of the different collectins
enable them to recognise a range of distinct microbial surface components exposed on
different microorganisms. The terminal CRDs are distributed in such a way that all
three domains target surfaces that present binding sites with a spacing of approximately
53 Å (Sheriff et al., 1994; Weis & Drickamer, 1994). This property of 'pattern
recognition' may contribute further to the selective binding of microbial surfaces.
The collagenous region, or possibly the N-terminal tails, of the collectins is recognised
by specific receptors on phagocytes, and is the binding site for associated
proteases that are activated to initiate the complement cascade upon binding of the
CRD to a target.

Mannan-binding lectin (MBL), also termed mannose-binding lectin or mannose-binding
protein, is a collectin which has gained great interest as an important part of the
innate immune system. MBL binds to specific carbohydrate structures found on the
surface of a range of microorganisms including bacteria, yeast, parasitic protozoa
and viruses, and has been found to exhibit antibacterial activity through killing
mediated by activation of the terminal, lytic complement components or through
promotion of phagocytosis. MBL deficiency is associated with susceptibility to frequent
infections by a variety of microorganisms in childhood, and possibly also in adults.

The CRD of MBL preferentially recognises hexoses with equatorial 3- and 4-OH
groups, such as mannose, glucose, N-acetylmannosamine and N-acetylglucosamine,
while carbohydrates which do not fulfil this steric requirement, such as galactose
and D-fucose, are not bound (Weis et al., 1992). The carbohydrate selectivity is
obviously an important aspect of the self/non-self discrimination by MBL and is
probably mediated by the difference in prevalence of mannose and N-acetylglucosamine
residues on microbial surfaces, one example being the high content of mannose in
the cell wall of yeasts such as Saccharomyces cerevisiae and Candida albicans.
Carbohydrate structures in glycosylation of mammalian proteins are usually
completed with sialic acid, which prevents binding of MBL to these oligomeric
carbohydrates and thus prevents MBL recognition of 'self' surfaces. Also, the trimeric
structure of each MBL subunit may be of importance for target recognition.

SUBSTITUTE SHEET (RULE 26)

WO 2004/024925 PCT/DK2003/000585

Complement is a group of proteins present in blood plasma and tissue fluid that aids
the body's defences following an infection. The complement system is activated
through at least three distinct pathways, designated the classical pathway, the
alternative pathway, and the MBLectin pathway (Janeway et al., 1999). The classical
pathway is initiated when complement factor 1 (C1) recognises surface-bound
immunoglobulin. The C1 complex is composed of two proteolytic enzymes, C1r and
C1s, and a non-enzymatic part, C1q, which contains immunoglobulin-recognising
domains. C1q and MBL share structural features, both molecules having a
bouquet-like appearance when visualised by electron microscopy. Also, like C1q, MBL
is found in complex with two proteolytic enzymes, the mannan-binding lectin
associated proteases (MASP). The three pathways all generate complement factor 3 (C3)
convertase, which ensures the binding of C3b to the activating surface, i.e. the
targeted microbial pathogen. Conversion of C3 into surface-bound C3b
is pivotal in the process of eliminating the microbial pathogen by phagocytosis or
lysis (Janeway et al., 1999).

Certain O-antigen specific oligosaccharides of Salmonella have been reported to
activate complement in C4-deficient guinea-pig serum, and Salmonella serogroup C
was later shown to react with MBL and hence activate complement by the MBLectin
pathway, which is also termed the MBL pathway of complement activation or the
lectin pathway.

It has for some time been speculated that the innate immune system may collaborate
with the adaptive immune system in the generation of specific immune responses,
as exemplified by the antibody response after infection or vaccination.
Fearon's group has shown that the attachment of the C3d fragment of complement
factor C3 onto a protein antigen through fusion by gene technology can increase
the immunogenicity of the antigen 1000-fold or more. Practical applications of this
technique, or any number of modifications, are still awaited.

The importance of the complement system for normal immune responses was first
suggested by Pepys, who found impaired antibody responses to sheep erythrocytes,
a thymus-dependent antigen, in mice that were C3-depleted with cobra venom factor.
The idea of a link between innate and adaptive immunity was supported by reports
demonstrating reduced primary antibody responses to thymus-dependent antigens
and impaired IgM to IgG switching in patients and experimental animals with
deficiencies of C4, C2 and C3. The mechanism may involve the generation of
C3-derived ligands for binding of antigen or antigen-containing complexes to
complement receptors on B lymphocytes or antigen-presenting cells. Thus, blocking of CR1
(CD35) and CR2 (CD21) in mice with specific anti-CR1 and anti-CR2 antibodies or
with soluble receptor protein reduced antibody responses to immunisation, and
experiments with CR1- and CR2-deficient knock-out mice show the requirement of
these receptors for responses to thymus-dependent antigens. In addition, patients
with leucocyte adhesion deficiency, who lack the CD11/CD18 adhesion molecule
CR3, demonstrate impaired antibody responses and failure to switch from IgM to
IgG. The C3-derived fragment C3d, a specific CR2 ligand, as mentioned above,
shows a strong dose-dependent adjuvant effect.

Deficiencies of the classical complement pathway (C1, C2, C4 and C3) are associated
with infections by encapsulated bacteria. The main reason for this is probably
the reduced efficiency of opsonic and bactericidal defence mechanisms caused by
complement dysfunction. However, impaired immune responses to polysaccharide
antigens might also be considered. The influence of complement on responses to
thymus-independent antigens has not been extensively studied, and the available
information is contradictory. Thus, low antibody responses to thymus-independent
antigens have been clearly documented in C3-depleted mice and C3-deficient dogs.
On the other hand, some reports find that C3-deficient patients appear to respond
normally to immunisation with polysaccharide vaccines.

Ficolins, like MBL, are lectins that contain a collagen-like domain. Unlike MBL,
however, they have a fibrinogen-like domain, which is similar to fibrinogen β- and
γ-chains. Ficolins also form oligomers of structural subunits, each of which is
composed of three identical 35 kDa polypeptides. Each subunit is composed of an
amino-terminal, cysteine-rich region; a collagen-like domain that consists of tandem
repeats of Gly-Xaa-Yaa triplet sequences (where Xaa and Yaa represent any amino
acid); a neck region; and a fibrinogen-like domain. The oligomers of ficolins
comprise two or more subunits; in particular, a tetrameric form of ficolin has been
observed.

Some of the ficolins trigger the activation of the complement system in substantially
the same way as MBL. This triggering of the complement system results in the
activation of novel serine proteases (MASPs) as described above.

The fibrinogen-like domain of several lectins has a similar function to the CRD of
C-type lectins including MBL, and thereby functions as a pattern-recognition receptor
to discriminate pathogens from self.

Serum ficolins have a common binding specificity for GlcNAc
(N-acetyl-glucosamine), elastin or GalNAc (N-acetyl-galactosamine). The fibrinogen-like
domain is responsible for the carbohydrate binding. In human serum, two types of
ficolin, known as L-ficolin (P35, ficolin L, ficolin 2 or hucolin) and H-ficolin (Hakata
antigen, ficolin 3 or thermolabile β2-macroglycoprotein), have been identified, and both
of them have lectin activity. L-ficolin recognises GlcNAc and H-ficolin recognises
GalNAc. Another ficolin known as M-ficolin (P35-related protein, ficolin 1 or ficolin
A) is not considered to be a serum protein and is found in leucocytes and in the
lungs. L-ficolin and H-ficolin activate the lectin-complement pathway in association
with MASPs. M-ficolin, L-ficolin and H-ficolin have calcium-independent lectin
activity.

MASPs (MBL-associated serine proteases), comprising MASP-1, MASP-2 and
MASP-3, are proteolytic enzymes that are responsible for activation of the lectin
pathway. The overall structure of MASPs resembles that of the two proteolytic
components of the first factor in the classical complement pathway, C1r and C1s. The
lectin pathway is initiated when MBL or a ficolin associated with MASP-1, MASP-2,
MASP-3 and sMAP binds to a carbohydrate structure on the surfaces of e.g. bacteria,
yeast, parasitic protozoa or viruses. MASP-2 is the enzyme component that, like
C1s in the classical pathway, cleaves the complement components C4 and C2 to
form the C3 convertase C4bC2a, which is common to both the lectin- and
classical-pathway activation routes.

MASP-1, MASP-2, MASP-3 and sMAP are encoded by two genes; sMAP is a truncated
form of MASP-2, and MASP-3 is produced from the MASP-1 gene by alternative
splicing. The MASP-1 gene has an H-chain-encoding region that is common to
MASP-1 and MASP-3, which is followed by tandem repeats of
protease-domain-encoding regions that are specific to MASP-3 and MASP-1.

The MASP family can be divided into two phylogenetic lineages: the TCN-type and
AGY-type lineages. The TCN-type lineage, which includes MASP-1, has a TCN codon
(where N denotes A, G, C or T) that encodes the active-site serine, the presence
of a histidine-loop disulphide bridge and split exons. By contrast, the AGY-type
lineage, which includes MASP-2, MASP-3, C1r and C1s, is characterised by an
AGY codon (where Y denotes C or T) that encodes the active-site serine, the
absence of a histidine-loop and a single exon.
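
The codon criterion used to separate the two lineages can be illustrated with a short sketch. It is not part of the specification: the function name `serine_codon_type` is hypothetical, and the sketch only inspects a single codon that is assumed to encode the active-site serine; it does not check the histidine-loop or exon criteria.

```python
# Illustrative sketch only: classify a serine codon as TCN-type or AGY-type.
# N denotes A, G, C or T; Y denotes C or T (standard nucleotide ambiguity codes).
SERINE_TCN = {"TCA", "TCC", "TCG", "TCT"}
SERINE_AGY = {"AGC", "AGT"}


def serine_codon_type(codon: str) -> str:
    """Return "TCN" or "AGY" for a serine codon; raise ValueError otherwise."""
    normalised = codon.upper().replace("U", "T")  # accept RNA spelling as well
    if normalised in SERINE_TCN:
        return "TCN"
    if normalised in SERINE_AGY:
        return "AGY"
    raise ValueError(f"{codon!r} is not a serine codon")


if __name__ == "__main__":
    for codon in ("TCG", "AGC", "agu"):
        print(codon, "->", serine_codon_type(codon))
```

All six serine codons of the standard genetic code fall into exactly one of the two classes, which is why the codon at the active site can serve as a lineage marker.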

MASP-1, MASP-2, MASP-3, C1r and C1s consist of six domains: two
C1r/C1s/Uegf/bone morphogenetic protein 1 (CUB) domains; an epidermal growth
factor (EGF)-like domain; two complement control protein (CCP) domains or short
consensus repeats (SCRs); and a serine-protease domain. Histidine (H), aspartic
acid (D) and serine (S) residues are essential for the formation of the active centre
in the serine-protease domain. Only MASP-1 has two additional cysteine residues in
the light chain, which form a histidine-loop disulphide bridge (S-S), as is found in
trypsin and chymotrypsin. On binding of MBL and ficolins to carbohydrate on the surface
of a pathogen, the pro-enzyme form of a MASP is cleaved between the second CCP
and the protease domain, which results in an active form that consists of two
polypeptides: heavy and light chains (also known as A and B chains).

Summary of invention 

The present invention relates to fusion proteins capable of activating the complement
system. Accordingly, the present invention relates to a fusion protein comprising

a first polypeptide sequence derived from a lectin-complement pathway activating
protein (=complement activating protein) or a functional homologue thereof; and
a second polypeptide sequence derived from a collectin or a functional homologue
thereof;

wherein said complement activating protein is not a collectin.

The fusion protein is suitable for use in treatment consisting of creation,
reconstitution, enhancing and/or stimulating the opsonic and/or bactericidal activity of the
complement system, i.e. enhancing the ability of the immune defence to recognise
and kill microbial pathogens, and accordingly, the invention relates to a medicament
comprising the fusion protein.

Also, in another aspect the invention relates to a method of treatment of a clinical
condition in an individual in need thereof comprising administering to said individual
the fusion protein as defined above.

In another aspect the invention relates to a method of treatment or prophylaxis of a
clinical condition, such as infection, in an individual in need thereof comprising
administering to said individual a fusion protein comprising
a first polypeptide sequence derived from a protein capable of forming oligomers of
structural units; and
a second polypeptide sequence derived from a mannose-binding lectin (MBL),
wherein said first polypeptide sequence and said second polypeptide sequence are not
derived from the same protein, and said fusion protein is capable of associating with
MBL-associated serine protease (MASP). The first polypeptide sequence is
preferably derived from a protein capable of forming tetramers, pentamers, and/or
hexamers of a structural unit. In a preferred embodiment the first polypeptide
sequence and the second polypeptide sequence are as described below.

In a further aspect the invention relates to use of the fusion protein as defined above
for the preparation of a medicament for the treatment of a clinical condition in an
individual in need thereof.

Furthermore, the invention relates to a method for producing the fusion protein, as
well as an isolated nucleic acid sequence encoding the fusion protein, a vector
comprising the sequence and a cell comprising the vector.

Drawings

Figure 1 shows the sequence of L-ficolin.
Figure 2 shows the sequence of MBL.
Figure 3 shows an example of a fusion protein.
Figures 4-8 show plasmids as described in Example 1.
Figure 9 shows an alignment of fusion proteins described in Example 1.
Figure 10 shows two Western blots as discussed in Example 2.

Definitions 

Collectins: A family of structurally related, carbohydrate-recognising proteins of in- 
nate immunity, including mannan-binding lectin (MBL) and surfactant proteins A and 
D. The name refers to the presence of a collagen-like region and a C-type lectin 
domain. 

Complement: A group of proteins present in blood plasma and tissue fluid that aids 
the body's defences following an infection. Complement is involved in destroying 
foreign cells and attracting phagocytes to the area of conflict in the body. 

Conjugated: An association formed between two compounds, for example between
an immunogenic determinant and a collectin and/or collectin homologue or between
an immunogenic determinant and a saccharide. The association may be a physical
association generated e.g. by the formation of a chemical bond, such as a
covalent bond.

CRD: Carbohydrate recognition domain, a C-type lectin domain that is found at the
C-terminus of collectins.

Immunogenic determinant: A molecule, or a part thereof, containing one or more
epitopes that will stimulate the immune system of a host organism to make a
secretory, humoral and/or cellular antigen-specific response, or a DNA molecule which is
capable of producing such an immunogen in a vertebrate.

Immune response: Response to an immunogenic composition comprising an
immunogenic determinant. An immune response involves the development in the host of
a cellular and/or humoral immune response to the administered composition or
vaccine in question. An immune response generally involves the action of one or
more of i) the antibodies raised, ii) B cells, iii) helper T cells, iv) suppressor T cells,
v) cytotoxic T cells and vi) complement directed specifically or unspecifically to an
immunogenic determinant present in an administered immunogenic composition.

Lectin: Proteins that specifically bind carbohydrates. 

MASP: MBL-associated serine protease.

MBL: Mannan-binding lectin or mannose-binding lectin. 

Subunit complex (= structural unit): A complex of three individual fusion proteins, like
the subunit complex discussed above for MBL and ficolins.

Detailed description of the invention 






An object of the present invention is to provide a fusion protein capable of activating
the complement system in order to aid in preventing or treating diseases, in
particular infectious diseases.

The fusion protein is composed of 

a first polypeptide sequence derived from a lectin-complement pathway activating
protein (=complement activating protein) or from a functional homologue thereof;
and

a second polypeptide sequence derived from a collectin or from a functional
homologue thereof;

wherein said complement activating protein is not a collectin.

By combining a polypeptide sequence derived from a lectin-complement pathway
activating protein and a polypeptide sequence derived from a collectin, it is possible
to design a fusion protein having binding affinity for a variety of carbohydrates,
preferably bacterial and viral carbohydrates, and at the same time having complement
system activating activity.

First polypeptide sequence


The first polypeptide sequence may be derived from any lectin-complement pathway
activating protein. Said lectin-complement pathway activating protein may be a
naturally occurring lectin-complement pathway activating protein or a variant
or homologue of said lectin-complement pathway activating protein, wherein said
variant or homologue has maintained the lectin-complement pathway activating
activity.

It is preferred that the fusion protein is capable of forming subunit complexes, each 
consisting of 3 individual fusion proteins as defined above. 


Also, the first polypeptide sequence is preferably capable of forming oligomeric
complexes with the first polypeptide sequence of another fusion protein, wherein said
other fusion protein may be identical to the first fusion protein. Thereby an
oligomeric complex of two or more fusion proteins or two or more subunit complexes
may be provided, said oligomeric complex having a higher binding avidity for
bacterial or viral carbohydrates than the monomeric fusion protein. In a preferred
embodiment the oligomeric complex is a dimeric subunit complex, more preferably a
trimeric subunit complex, more preferably a tetrameric subunit complex.

In a preferred embodiment the lectin-complement pathway activating protein is a
ficolin as defined above. Said ficolin may be L-ficolin, H-ficolin or M-ficolin or
variants or homologues thereof. In a preferred embodiment the lectin-complement
pathway activating protein is L-ficolin.

In another embodiment the first polypeptide sequence comprises the fibrinogen
domain of the lectin and/or the neck region of a lectin, such as a ficolin or a
homologue or a variant thereof.

When the first polypeptide sequence is derived from ficolin or from a variant or a
homologue of ficolin, it is preferred that the first polypeptide sequence comprises the
collagen-like domain from ficolin or from a variant or homologue of ficolin. In another
embodiment it is preferred that the first polypeptide sequence comprises the
cysteine-rich domain from ficolin or from a variant or homologue of ficolin. It is even more
preferred that the first polypeptide sequence comprises the collagen-like domain
and the cysteine-rich domain from ficolin or from a variant or homologue of ficolin.

It is more preferred that the first polypeptide sequence comprises the collagen-like
domain from L-ficolin or from a variant or homologue of L-ficolin. In another
embodiment it is more preferred that the first polypeptide sequence comprises the
cysteine-rich domain from L-ficolin or from a variant or homologue of L-ficolin. It is
even more preferred that the first polypeptide sequence comprises the N-terminal
region of L-ficolin including two cysteine amino acid residues.

It is even more preferred that the first polypeptide sequence comprises the
collagen-like domain and the cysteine-rich domain from L-ficolin or from a variant or
homologue of L-ficolin.

In a particularly preferred embodiment the ficolin has one of the sequences listed
below with reference to their database and accession number. For each of the sequences
the cysteine-rich region and the collagen-like region are described.

NP_003656. ficolin 3 precursor; ficolin (collagen/fibrinogen domain-containing) 3
(Hakata antigen) [Homo sapiens] [gi:4504331]

90..299 /region_name="pfam00147, fibrinogen_C, Fibrinogen beta and gamma
chains, C-terminal globular domain"
90..299 /region_name="smart00186, FBG, Fibrinogen-related domains (FReDs);
Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety
of fibrinogen-related proteins, including tenascin and Drosophila scabrous"

1 mdllwilpsi wllllggpac lktqehpscp gpreieaskv vllpscpgap gspgekgapg
61 pqgppgppgk mgpkgepgdp vnllrcqegp rncrellsqg atlsgwyhic lpegralpvf
121 cdmdtegggw lvfqrrqdgs vdffrswssy ragfgnqese fwlgnenlhq ltlqgnweir
181 veledfngnr tfahyatfrl lgevdhyqia lgkfsegtag dslslhsgrp fttydadhds
241 snsncavivh gawwyascyr snlngryavs daaahkygid wasgrgvghp yrrvrmmir


XP_116792. similar to Ficolin 2 precursor (Collagen/fibrinogen domain-containing
protein 2) (Ficolin-B) (Ficolin B) (Serum lectin P35) (EBP-37) (Hucolin) (L-Ficolin)
[Homo sapiens] [gi:20477458]

91..168 /region_name="pfam00147, fibrinogen_C, Fibrinogen beta and gamma
chains, C-terminal globular domain"
91..168 /region_name="smart00186, FBG, Fibrinogen-related domains (FReDs);
Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety
of fibrinogen-related proteins, including tenascin and Drosophila scabrous"

1 rrigpallalsf lwtmaltedt cpamieyval nsepgmaskn psrrhglsll wdqgpgarg
61 vrtdqgpsga dpgsleihge cpifpseqvi lthhnnypfs tedqdndrda encavhyqga
121 wwyaschlsh lngvylggar dsftnginwk sgkgnnysyk vsemkvrpt


O00602. Ficolin 1 precursor (Collagen/fibrinogen domain-containing protein 1)
(Ficolin-A) (Ficolin A) (M-Ficolin) [gi:20455484]

1..29 /gene="FCN1" /region_name="Signal" /note="POTENTIAL."
30..326 /gene="FCN1" /region_name="Mature chain" /note="FICOLIN 1."
55..93 /gene="FCN1" /region_name="Domain" /note="COLLAGEN-LIKE."
133 /gene="FCN1" /region_name="Conflict" /note="T -> N (IN REF. 1)."
144..290 /gene="FCN1" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
287 /gene="FCN1" /region_name="Conflict" /note="N -> S (IN REF. 1)."
305 /gene="FCN1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 melsgatmar glavllvlfl hiknipaqaa dtcpevkwg legsdkltil rgcpglpgap
61 gpkgeagvig ergerglpga pgkagpvgpk gdrgekgmrg ekgdagqsqs catgprnckd
121 lldrgyflsg whtiyipdcr pitvlcdmdt dgggwtvfqr rmdgsvdfyr dwaaykqgfg
181 sqigefwign dnihaltaqg sselrvdlvd fegnhqfaky ksfkvadeae kyklvlgafv
241 ggsagnsltg hnnnffstkd qdndvsssnc aekfqgawwy adchasning lylmgphesy
301 anginwsaak gykysykvse mkvrpa //


O75636. Ficolin 3 precursor (Collagen/fibrinogen domain-containing protein 3)
(Collagen/fibrinogen domain-containing lectin 3 P35) (Hakata antigen) [gi:13124185]

1..21 /gene="FCN3" /region_name="Signal" /note="POTENTIAL."
22..299 /gene="FCN3" /region_name="Mature chain" /note="FICOLIN 3."
48..8b /gene="FCN3" /region_name="Domain" /note="COLLAGEN-LIKE."
50 /gene="FCN3" /site_type="hydroxylation"
53 /gene="FCN3" /site_type="hydroxylation"
59 /gene="FCN3" /site_type="hydroxylation"
65 /gene="FCN3" /site_type="hydroxylation"
68 /gene="FCN3" /site_type="hydroxylation"
77 /gene="FCN3" /site_type="hydroxylation"




119..265 /gene="FCN3" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
189 /gene="FCN3" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 mdllwilpsi wllllggpac lktqehpscp gpreleaskv vllpscpgap gspgekgapg
61 pqgppgppgk mgpkgepgdp vnllrcqegp rncrellsqg atlsgwyhic lpegralpvf
121 cdmdtegggw lvfqrrqdgs vdffrswssy ragfgnqese fwlgnenlhq ltlqgnweir
181 veledfngnr tfahyatfrl lgevdhyqia lgkfsegtag dslslhsgrp fttydadhds
241 snsncavivh gawwyascyr sningryavs daaahkygid wasgrgvghp yrrvrmmir

XP_130120. similar to Ficolin 2 precursor (Collagen/fibrinogen domain-containing
protein 2) (Ficolin-B) (Ficolin B) (Serum lectin P35) (EBP-37) (Hucolin)
[Mus musculus] [gi:20823464]

59..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
59..89 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
60..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
60..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
60..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
60..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
60..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
61..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
61..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
61..95 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
103..312 /region_name="Fibrinogen beta and gamma chains, C-terminal globular
domain" /note="fibrinogen_C" /db_xref="CDD:pfam00147"
103..312 /region_name="Fibrinogen-related domains (FReDs)" /note="FBG"
/db_xref="CDD:smart00186"

1 malgsaalfv ltltvhaagt cpelkvldle gykqitilqg cpglpgaagp kgeagakgdr
61 gesglpgipg kegptgpkgn qgekgirgek gdsgpsqsca tgprtckell tqghfltgwy
121 tiylpdcrpi tvlcdmdtdg ggwtvfqrri dgsvdffrdw tsykrgfgsq lgefwigndn
181 ihalttqgts eirvdisdfe gkhdfakyss fqiqgeaeky klilgnfigg gagdsltphn
241 nrlfstkdqd ndgstsscam gyhgawwysq chtsnln^ly lrgphksyan gvnwkswrgy
301 nysckvsemk vrii

NP_056654. ficolin 2 isoform d precursor; ficolin (collagen/fibrinogen
domain-containing lectin) 2 (hucolin); ficolin (collagen/fibrinogen domain-containing lectin) 2;
hucolin [Homo sapiens] [gi:8051590]

39..95 /region_name="collagen-like domain"

1 meldravgvl gaatlllsfl gmawalqaad tcpevkmvgl egsdkltilr gcpglpgapg
61 dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gd
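
The feature annotations in these listings give 1-based, inclusive residue ranges. As a sketch of how such a range maps onto the printed sequence, the following extracts the annotated collagen-like domain (39..95) from the NP_056654 entry above. The helper name `extract_region` is hypothetical, and the sequence is copied from the listing as printed, so any transcription errors in the listing carry over.

```python
# Illustrative sketch only: slice a feature range out of a listed sequence.
# Feature coordinates in the listing are 1-based and inclusive.


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive), ignoring spaces in the listing."""
    residues = "".join(sequence.split()).lower()
    if not 1 <= start <= end <= len(residues):
        raise ValueError(f"range {start}..{end} outside 1..{len(residues)}")
    return residues[start - 1:end]


# NP_056654 (ficolin 2 isoform d precursor), as printed in the listing above.
NP_056654 = (
    "meldravgvl gaatlllsfl gmawalqaad tcpevkmvgl egsdkltilr gcpglpgapg "
    "dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gd"
)

collagen_like = extract_region(NP_056654, 39, 95)

if __name__ == "__main__":
    print(len(collagen_like), collagen_like)
```

The same helper could be applied to any other entry in the list by substituting its sequence and the coordinates given in its feature lines.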

NP_056653. ficolin 2 isoform c precursor; ficolin (collagen/fibrinogen
domain-containing lectin) 2 (hucolin); ficolin (collagen/fibrinogen domain-containing lectin) 2;
hucolin [Homo sapiens] [gi:8051588]

39..95 /region_name="collagen-like domain"
102..143 /region_name="Fibrinogen beta and gamma chains, C-terminal globular
domain" /note="fibrinogen_C" /db_xref="CDD:pfam00147"
102..143 /region_name="Fibrinogen-related domains (FReDs)" /note="FBG"
/db_xref="CDD:smart00186"

1 meldravgvl gaatlllsfl gmawalqaad tcpevkmvgl egsdkltilr gcpglpgapg
61 dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gprtckdild rghflsgwht
121 iylpdcrpit vlcdmdtdgg gwtvsvglgr ggqpgspggq aahlvgehtl efsillvgds
181 qr

NP_056652. ficolin 2 isoform b precursor; ficolin (collagen/fibrinogen
domain-containing lectin) 2 (hucolin); ficolin (collagen/fibrinogen domain-containing lectin) 2;
hucolin [Homo sapiens] [gi:8051586]

sig_peptide 1..25
mat_peptide 26..275
60..275 /region_name="FBG domain" /note="fibrinogen beta/gamma homology"
64..275 /region_name="Fibrinogen-related domains (FReDs)" /note="FBG"
/db_xref="CDD:smart00186"
64..274 /region_name="Fibrinogen beta and gamma chains, C-terminal globular
domain" /note="fibrinogen_C" /db_xref="CDD:pfam00147"

1 meldravgvl gaatlllsfl gmawalqaad tcpgergppg ppgkagppgp ngapgepqpc
61 ltgprtckdl ldrghflsgw htiylpdcrp ltvlcdmdtd gggwtvfqrr vdgsvdfyrd
121 watykqgfgs rigefwignd nihaltaqgt selrvdlvdf ednyqfakyr sfkvadeaek
181 ynlvigafve gsagdsltfh nnqsfstkdq dndlntgnca vmfqgawwyk nchvsnlngr
241 ylrgthgsfa nginwksgkg ynysykvsem kvrpa

NP_001994. ficolin 1 precursor; ficolin (collagen/fibrinogen domain-containing) 1
[Homo sapiens] [gi:8051584]


sig_peptide 1..27
mat_peptide 28..326
40..108 /region_name="collagen-like domain"
50..105 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
51..107 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
52..106 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen"
/db_xref="CDD:pfam01391"
115..326 /region_name="FBG domain" /note="fibrinogen beta/gamma homology"
115..326 /region_name="Fibrinogen-related domains (FReDs)" /note="FBG"
/db_xref="CDD:smart00186"
115..325 /region_name="Fibrinogen beta and gamma chains, C-terminal globular
domain" /note="fibrinogen_C" /db_xref="CDD:pfam00147"
variation 315 /db_xref="dbSNP:1128428"
variation 316 /db_xref="dbSNP:1128429"
variation 317 /db_xref="dbSNP:1128430"

1 melsgatmar glavllvlfl hiknipaqaa dtcpevkwg legsdkltil rgcpglpgap 
61 gpkgeagvig ergerglpga pgkagpvgpk gdrgekgmrg ekgdagqsqs catgprnckd 
121 lidrgyflsg whtiylpdcr pitvlcdmdt dgggwtvfqr rmdgsvdtyr dwaaykqgfg 
181 sqigefwign dnlhaltaqg sselrvdivd fegniiqfaky ksfkvadeae kyklvlgafv 
241 ggsagnsltg hnnnffstkd qdndvsssnc aekfqgawwy adchasning lylmgphesy 
301 anginwsaak gykysykvse mkvrpa

NP_004099. ficolin 2 isoform a precursor; ficolin (collagen/fibrinogen domain-
containing lectin) 2 (hucolin); ficolin (collagen/fibrinogen domain-containing lectin) 2;
hucolin [Homo sapiens] [gi:4758348]

sig_peptide 1..25
mat_peptide 26..313
39..95 /region_name="collagen-like domain"
98..313 /region_name="FBG domain" /note="fibrinogen beta/gamma homology"
102..313 /region_name="Fibrinogen-related domains (FReDs)" /note="FBG"
/db_xref="CDD:smart00186"
102..312 /region_name="Fibrinogen beta and gamma chains, C-terminal globular
domain" /note="fibrinogen_C" /db_xref="CDD:pfam00147"

1 meldravgvl gaatlllsfl gmawalqaad tcpevkmvgi egsdkltilr gcpglpgapg 
61 dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gprtckdlld rghflsgwlit 
121 iylpdcrpit vlcdmdtdgg gwtvfqrrvd gsvdfyrdwa tykqgfgsri gefwigndni 
181 haltaqgtse lrvdivdfed nyqfakyrsf kvadeaekyn lvlgafvegs agdsltfhnn
241 qsfstkdqdn dintgncavm fqgawwyknc hvsnlngryl rgthgsfang inwksgkgyn
301 ysykvsemkv rpa 

Q9WTS8. Ficolin 1 precursor (Collagen/fibrinogen domain-containing protein 1)
(Ficolin-A) (Ficolin A) (M-Ficolin) [gi:13124116]

1..22 /gene="FCN1" /region_name="Signal" /note="POTENTIAL."
23..335 /gene="FCN1" /region_name="Mature chain" /note="FICOLIN 1."
50..88 /gene="FCN1" /region_name="Domain" /note="COLLAGEN-LIKE."
152..298 /gene="FCN1" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
271 /gene="FCN1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 mwwpmlwafp vllclcssqa Igqesgacpd vkivglgaqd kvaviqscps fpgppgpkge 
61 pgspagrger glqgspgkmg ppgskgepgt mgppgvkgek gergtaspig qkelgdalcr 
121 rgprsckdil trgifltgwy tiyipdctpl tvlcdmdvdg ggwtvfqrrv dgsinfyrdw 
181 dsykrgfgnl gtefwigndy Ihlltangnq eirvdirefq gqtsfakyss fqvsgeqeky 
241 kitlgqfleg tagdsltkhn nmafsthdqd ndtnggknca alfhgawwyh dchqsnlngr
301 yipgshesya dginwisgrg hrysykvaem kiras 

Q15485. Ficolin 2 precursor (Collagen/fibrinogen domain-containing protein 2)
(Ficolin-B) (Ficolin B) (Serum lectin P35) (EBP-37) (Hucolin) (L-Ficolin) [gi:13124203]

1..25 /gene="FCN2" /region_name="Signal" /note="POTENTIAL."
26..313 /gene="FCN2" /region_name="Mature chain" /note="FICOLIN 2."
54..92 /gene="FCN2" /region_name="Domain" /note="COLLAGEN-LIKE."
131..277 /gene="FCN2" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
240 /gene="FCN2" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."
300 /gene="FCN2" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."


1 meldravgvl gaatlllsfl gmawalqaad tcpevkmvgl egsdkltilr gcpglpgapg 
61 dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gprtckdild rghflsgwht 
121 iylpdcrpit vicdmdtdgg gwtvfqrrvd gsydfyrdwa tykqgfgsri gefwigndnl 
181 haltaqgtse lrvdlvdfed nyqfakyrsf kvadeaekyn lvlgafvegs agdsltfhnn
241 qsfstkdqdn dintgncavm fqgawwyknc hvsnlngryl rgthgsfang inwksgkgyn

301 ysykvsemky rpa 

O70497. Ficolin 2 precursor (Collagen/fibrinogen domain-containing protein 2)
(Ficolin-B) (Ficolin B) (Serum lectin P35) (EBP-37) (Hucolin) [gi:13124181]


<1..15 /gene="FCN2" /region_name="Signal" /note="POTENTIAL." 
16..>306 /gene="FCN2" /region_name="Mature chain" /note="FICOLIN 2." 
41..79 /gene="FCN2" /region_name="Domain" /note="COLLAGEN-LIKE."
130..276 /gene="FCN2" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
299 /gene="FCN2" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 IgsaalfvIt Itvhaagtcp elkvldlegy kqltilqgcp gipgaagpkg eagakgdrge 
61 sglpgipgke gptgpkgnqg ekgirgekgd sgpsqscatg prtckelltq ghfltgwyti
121 yipdcrpmtv lcdmdtdggg wtvfqrrldg svdffrdwts ykrgfgsqig efwigndnih
181 alttqgtsel rvdisdfegk hdfakyssfq iqgeaekyki ilgnflggga gdsltphnnr 

241 Ifstkdqdnd gstsscamgy hgawwysqch tsnlnglylr gphksyangv nwkswrgyny 
301 sckvse 


O70165. Ficolin 1 precursor (Collagen/fibrinogen domain-containing protein 1)
(Ficolin-A) (Ficolin A) (M-Ficolin) [gi:13124179]

1..22 /gene="FCN1" /region_name="Signal" /note="POTENTIAL."
23..334 /gene="FCN1" /region_name="Mature chain" /note="FICOLIN 1."
50..88 /gene="FCN1" /region_name="Domain" /note="COLLAGEN-LIKE."
152..298 /gene="FCN1" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
261 /gene="FCN1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 mqwptlwafs gliclcpsqa lgqergacpd vkvvglgaqd kvwiqscpg fpgppgpkge
61 pgspagrger gfqgspgkmg pagskgepgt mgppgvkgek gdtgaapslg ekeigdticq
121 rgprsckdil trgifltgwy tihlpdcrpl tvlcdmdvdg ggwtvfqrrv dgsidffrdw
181 dsykrgfgni gtefwlgndy lhiitangnq elrvdiqdfq gkgsyakyss fqvseeqeky
241 kitlgqfleg tagdsltkhn nmsftthdqd ndansmncaa lfhgawwyhn chqsnlngry
301 lsgshesyad ginwgtgqgh hysykvaemk iras

P57756. Ficolin 2 precursor (Collagen/fibrinogen domain-containing protein 2)
(Ficolin-B) (Ficolin B) (Serum lectin P35) (EBP-37) (Hucolin) [gi:13124114]

1..22 /gene="FCN2" /region_name="Signal" /note="POTENTIAL."
23..319 /gene="FCN2" /region_name="Mature chain" /note="FICOLIN 2."
48..86 /gene="FCN2" /region_name="Domain" /note="COLLAGEN-LIKE."
137..283 /gene="FCN2" /region_name="Domain" /note="FIBRINOGEN C-TERMINAL."
306 /gene="FCN2" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."

1 mvlgsaalfv lslcvteltl haadtcpevk vidlegsnki tilqgcpglp galgpkgeag
61 akgdrgesgl pghpgkagpt gpkgdrgekg vrgekgdtgp sqscatgprt ckelltrgyf
121 ltgwytiylp dcrpltvlcd mdtdgggwtv fqrridgtvd ffrdwtsykq gfgsqigefw
181 lgndnihalt tqgtnelrvd ladfdgnhdf akyssfqiqg eaekyklilg nflgggagds

241 Itsqnnmlfs tkdqdndqgs sncavryhga wwysdchtsn Inglylrglh ksyangvnwk 
301 swkgynysyk vsemkvrii 

JC5980. ficolin-A precursor - mouse [gi:7513652]
1..21 /region_name="domain" /note="signal sequence"
50..64 /region_name="domain" /note="collagen-like"
68..106 /region_name="domain" /note="collagen-like"
123..334 /region_name="domain" /note="fibrinogen beta/gamma homology #label FBG"


1 mqwptlwafs gllclcpsqa lgqergacpd vkvvglgaqd kwviqscpg fpgppgpkge
61 pgspagrger gfqgspgkmg pagskgepgt mgppgvkgek gdtgaapslg ekelgdticq
121 rgprsckdil trgifltgwy tihlpdcrpl tvlcdmdvdg ggwtvfqrrv dgsidffrdw
181 dsykrgfgnl gtefwigndy lhlltangnq eirvdiqdfq gkgsyakyss fqvseeqeky
241 kltlgqfleg tagdsltkhn nmsftthdqd ndansmncaa lfhgawwyhn chqsnlngry
301 lsgshesyad ginwgtgqgh hysykvaemk iras

S61517. ficolin-1 precursor - human [gi:2135116]
1..326 /note="36K HL^-cross-reactive plasma protein; hucolin, 35K" 
1..22 /region_name="domain" /note="signal sequence"
52..108 /region_name="region" /note="collagen-like"
115..326 /region_name="domain" /note="fibrinogen beta/gamma homology #label FBG"
305 /site_type="binding" /note="carbohydrate (Asn) (covalent)"

1 melsgatmar glavllvlfl hiknipaqaa dtcpevkvvg legsdkltil rgcpgipgap 
61 gpkgeagvig ergerglpga pgkagpvgpk gdrgekgmrg ekgdagqsqs catgprnckd 
121 lldrgyflsg whniylpdcr pitvlcdmdt dgggwtvfqr rmdgsvdfyr dwaaykqgfg 
181 sqigefwign dnihaltaqg sselrvdivd fegnhqfaky ksfkvadeae kyklvlgafv 
241 ggsagnsltg hnnnffstkd qdndvsssnc aekfqgawwy adchassing lylmgphesy
301 anginwsaak gykysykvse mkvrpa 

A47172. transforming growth factor-beta 1-binding protein homolog ficolin-alpha -
pig [gi:423206]

112..323 /region_name="domain" /note="fibrinogen beta/gamma homology #label FBG"

1 mdtrgvaaam rplvllvafi ctaapaldtc pevkwgleg sdklsilrgc pglpgaagpk
61 geagasgpkg gqgppgapge pgppgpkgdr gekgepgpkg esweteqcit gprtckellt 
121 rghilsgwht iylpdcqpit vicdmdtdgg gwtvfqrrsd gsvdfyrdwa aykrgfgsql 
181 gefwlgndhi haltaqgtne lrvdivdfeg nhqfakyrsf qvadeaekym lvlgafvegn
241 agdsltshnn sifttkdqdn dqyasncavl yqgawwynsc hvsnlngryl ggshgsfang
301 vnwssgkgyn ysykvsemkf rat 

JC4942. ficolin-1 precursor - human [gi:2135117]

1..22 /region_name="domain" /note="signal sequence"
45..101 /region_name="region" /note="collagen-like"
108..319 /region_name="domain" /note="fibrinogen beta/gamma homology #label FBG"
111..315 /region_name="region" /note="fibrinogen-like"
298 /site_type="binding" /note="carbohydrate (Asn) (covalent)"

1 marglavllv Iflhiknipa qaadtcpevk wglegsdkl tilrgcpglp gapgpkgeag 
61 vigergergl pgapgkagpv gpkgdrgekg mrgekgdagq sqscatgprn ckdildrgyf
121 lsgwhtiyip dcrpltvlcd mdtdgggwtv fqrrmdgsvd fyrdwaaykq gfgsqigefw
181 lgndnihalt aqgsselrvd lvdfegnhqf akyksfkvad eaekyklvlg afvggsagns

241 Itghnnnffs tkdqdndvss sncaekfqga wwyadchasn Inglylmgph esyanginws 
301 aakgykysyk vsemkvrpa 

AAF44911. symbol=BG:DS00929... [gi:7287873]


1 mkscffvlfl \Artllfevgqs sphtcpsgsp ngihqlmlpe eepfqvtqck ttardwiviq 
61 rrldgsvnfn qswfsykdgf gdpngeffig Iqklylmtre qphelfiqik hgpgatvyah 
121 fddfqvdset elyklervgk ysgtagdsir yhinkrfstf drdndesskn caaehgggww 
181 fhsclsr


The first polypeptide preferably comprises at least 10, such as at least 12, for
example at least 15, such as at least 20, for example at least 25, such as at least 30,
for example at least 35, such as at least 40, for example at least 50 consecutive amino
acid residues of the complement activating protein or of a variant or a homologue of
said protein. Such a variant or homologue is preferably at least 70%, such as 80%,
for example 90%, such as 95% identical to the complement activating protein.
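The two criteria above (a minimum run of consecutive residues and a minimum percentage identity) can be checked with a short script. The following is only a minimal sketch: the function names are hypothetical, the identity calculation assumes the two sequences have already been aligned to equal length with "-" as the gap character, and the text does not prescribe any particular alignment method or identity definition.

```python
from difflib import SequenceMatcher


def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest stretch of consecutive residues common to both sequences."""
    a, b = candidate.upper(), reference.upper()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical columns in a pre-computed pairwise alignment.

    Assumption: identity is counted over all alignment columns that are not
    gap-against-gap; other definitions (e.g. over the shorter sequence) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # First 20 residues of the ficolin 2 listings above, used purely as an illustration.
    reference = "MELDRAVGVLGAATLLLSFL"
    candidate = "XXMELDRAVGVLGAXX"
    print(longest_shared_run(candidate, reference) >= 10)
    print(percent_identity("ACDE", "ACDF") >= 70.0)
```

A candidate would satisfy the first criterion if `longest_shared_run` returns at least the chosen threshold (10, 12, 15 and so on), and the second if `percent_identity` is at least 70, 80, 90 or 95.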



The first polypeptide sequence of the fusion protein is preferably capable of
activating the lectin-complement pathway when bound directly or indirectly to a target,
such as a bacterium or a virus. In a preferred embodiment the first polypeptide
sequence is capable of associating with at least one MASP protein, such as a MASP
protein selected from the group consisting of MASP-1, MASP-2 and MASP-3 or
functional homologues or variants thereof. In particular, the first polypeptide is
capable of associating with said at least one MASP protein when being part of the fusion
protein. Thereby the first polypeptide sequence is capable of providing the fusion
protein with complement system activating activity.

In a particularly preferred embodiment the first polypeptide sequence comprises at
least the amino acid residues corresponding to 1-54 of the L-ficolin sequence of Figure
1, such as 1-55, such as 1-69, such as 1-77, such as 1-90, such as 1-93, such as 1-131,
such as 1-207 of the L-ficolin sequence of Figure 1. In particular the first polypeptide
sequence comprises the amino acid residues selected from: 1-55 of the L-ficolin sequence
of Figure 1, 1-54 of the L-ficolin sequence of Figure 1, 1-50, or 1-77 of the L-ficolin
sequence of Figure 1. In a more preferred embodiment the first polypeptide sequence has
the amino acid residues selected from: 1-55 of the L-ficolin sequence of Figure 1, 1-54
of the L-ficolin sequence of Figure 1, 1-50, or 1-77 of the L-ficolin sequence of Figure
1. In another embodiment the first polypeptide sequence comprises at least the amino
acid residues corresponding to 60-90 of the L-ficolin sequence of Figure 1, such as
55-90 of the L-ficolin sequence of Figure 1, such as 54-92 of the L-ficolin sequence of
Figure 1.
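The residue ranges above are 1-based and inclusive. As an illustration only, the sketch below joins the numbered sequence lines of a listing into one string and cuts out such a range. The helper names are hypothetical, and the example uses the NP_056654 listing from this section as it appears in the text; Figure 1 is not reproduced here, so its numbering may differ from this listing.

```python
import re


def parse_sequence_lines(lines):
    """Join flat-file sequence lines ('61 dkgeagtngk rgergppgpp ...') into one string."""
    parts = []
    for line in lines:
        body = re.sub(r"^\s*\d+\s+", "", line)          # drop the leading position number
        parts.append(re.sub(r"[^A-Za-z]", "", body))    # keep residue letters only
    return "".join(parts).upper()


def residue_range(sequence: str, first: int, last: int) -> str:
    """Return residues first..last, counted from 1 and inclusive at both ends."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError(f"range {first}-{last} outside sequence of length {len(sequence)}")
    return sequence[first - 1:last]


if __name__ == "__main__":
    listing = [
        "1 meldravgvl gaatlllsfl gmawalqaad tcpevkmvgl egsdkltilr gcpglpgapg",
        "61 dkgeagtngk rgergppgpp gkagppgpng apgepqpcit gd",
    ]
    seq = parse_sequence_lines(listing)
    print(len(seq))
    print(residue_range(seq, 1, 54))
```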

It is preferred that the first polypeptide sequence and the second polypeptide sequence
are selected to include the motif X-X-G-X-X-G at least 5 times, such as at least 7
times, preferably in a consecutive sequence. It is more preferred to select the first
polypeptide sequence and the second polypeptide sequence so that the aforementioned
motif is substituted once with the motif X-X-G-X-G. In the motifs, X means any
amino acid different from glycine, and G means glycine.
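The motif requirement can be tested mechanically. The sketch below counts the longest uninterrupted run of X-X-G-X-X-G hexamers in a sequence, with X taken as any residue other than glycine, as defined above. The function name is hypothetical, the example sequences are synthetic, and the single X-X-G-X-G substitution mentioned above is deliberately not modelled here.

```python
import re

# One X-X-G-X-X-G unit: two non-glycine residues, G, two non-glycine residues, G.
HEXAMER = r"[^G]{2}G[^G]{2}G"
# Lookahead so that runs starting at every position are examined, not just the first.
RUN = re.compile(rf"(?=((?:{HEXAMER})+))")


def longest_motif_run(sequence: str) -> int:
    """Largest number of back-to-back X-X-G-X-X-G units found anywhere in the sequence."""
    best = 0
    for match in RUN.finditer(sequence.upper()):
        best = max(best, len(match.group(1)) // 6)
    return best


if __name__ == "__main__":
    synthetic = "ppgppg" * 5          # synthetic collagen-like stretch, not a real protein
    print(longest_motif_run(synthetic) >= 5)
```

A sequence would meet the "at least 5 times in a consecutive sequence" preference if the returned value is 5 or more, and the "at least 7 times" preference if it is 7 or more.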

Second polypeptide sequence 

The second polypeptide sequence is preferably capable of associating with one or
more carbohydrates. This may be accomplished by incorporating at least the
carbohydrate recognition domain (CRD) of the collectin in question. Accordingly, the
second polypeptide sequence preferably comprises the CRD of a collectin or a
homologue or a variant thereof.

Preferably the collectin is selected from the group consisting of MBL (mannose-binding
lectin), SP-A (lung surfactant protein A), SP-D (lung surfactant protein D),
BK (or BC, bovine conglutinin) and CL-43 (collectin-43). Most preferably the collectin
is MBL.
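The sequence listings in this document give domain boundaries as locations of the form "start..end" (for example a region annotated as a carbohydrate recognition domain). As a rough sketch, and assuming such 1-based inclusive locations, a CRD could be cut out of a parsed sequence as follows. The function name and the toy sequence are hypothetical; the parser tolerates stray spaces around the dots because scanned listings often contain them.

```python
import re

# 'start..end', optionally with whitespace around the numbers or the dots.
LOCATION = re.compile(r"\s*(\d+)\s*\.\.\s*(\d+)\s*")


def extract_region(sequence: str, location: str) -> str:
    """Return the residues covered by a 'start..end' feature location (1-based, inclusive)."""
    match = LOCATION.fullmatch(location)
    if match is None:
        raise ValueError(f"not a start..end location: {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"location {start}..{end} outside sequence of length {len(sequence)}")
    return sequence[start - 1:end]


if __name__ == "__main__":
    toy = "ACDEFGHIKLMNPQRSTVWY"   # toy sequence, not a real collectin
    print(extract_region(toy, "3..6"))
```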


In a particularly preferred embodiment the collectin has one of the sequences listed
below, each identified by its database accession number.



Collectins 


SEQ ID NO: 42 

Q9NPY3 Complement component C1q receptor precursor (Complement component
1, q subcomponent, receptor 1) (C1qRp) (C1qR(p)) (C1q/MBL/SPA receptor) (CD93
antigen) (CDw93) gi|21759074|sp|Q9NPY3|CD93_HUMAN[21759074]

FEATURES Location/Qualifiers
source 1..652 /organism="Homo sapiens" /db_xref="taxon:9606"
gene 1..652 /gene="C1QR1" /note="CD93"
Protein 1..652 /gene="C1QR1" /product="Complement component C1q receptor precursor"
Region 1..21 /gene="C1QR1" /region_name="Signal"
Region 22..652 /gene="C1QR1" /region_name="Mature chain" /note="COMPLEMENT COMPONENT C1Q RECEPTOR."
Region 22 /gene="C1QR1" /region_name="Conflict" /note="T -> V (IN AA SEQUENCE)."
Region 24..580 /gene="C1QR1" /region_name="Domain" /note="EXTRACELLULAR (POTENTIAL)."
Region 32..174 /gene="C1QR1" /region_name="Domain" /note="C-TYPE LECTIN."
Region 36 /gene="C1QR1" /region_name="Conflict" /note="C -> T (IN AA SEQUENCE)."
Region 38..39 /gene="C1QR1" /region_name="Conflict" /note="TA -> RI (IN AA SEQUENCE)."
Region 155 /gene="C1QR1" /region_name="Conflict" /note="S -> N (IN REF. 1)."
Region 186 /gene="C1QR1" /region_name="Conflict" /note="G -> A (IN AA SEQUENCE)."
Region 260..301 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 1."
Bond bond(264,275) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(271,285) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(287,300) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 302..344 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 2."
Bond bond(306,317) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(311,328) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 318 /gene="C1QR1" /region_name="Variant" /note="V -> A. /FTId=VAR_013573."
Site 325 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."
Bond bond(330,343) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 345..384 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 3, CALCIUM-BINDING (POTENTIAL)."
Bond bond(349,358) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(354,367) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(369,383) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 385..426 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 4, CALCIUM-BINDING (POTENTIAL)."
Bond bond(389,400) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(396,409) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(411,425) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 427..468 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL)."
Bond bond(431,443) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(439,452) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(454,467) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 492 /gene="C1QR1" /region_name="Conflict" /note="S -> A (IN AA SEQUENCE)."
Region 496 /gene="C1QR1" /region_name="Conflict" /note="R -> Q (IN AA SEQUENCE)."
Region 504 /gene="C1QR1" /region_name="Conflict" /note="R -> G (IN AA SEQUENCE)."
Region 541 /gene="C1QR1" /region_name="Conflict" /note="P -> S (IN REF. 1)."
Region 581..601 /gene="C1QR1" /region_name="Transmembrane region" /note="POTENTIAL."
Region 594..601 /gene="C1QR1" /region_name="Domain" /note="POLY-LEU."
Region 602..652 /gene="C1QR1" /region_name="Domain" /note="CYTOPLASMIC (POTENTIAL)."

ORIGIN 1 matsmgllli llllltqpga gtgadteaw cvgtacytah sgklsaaeaq nhcnqnggnl
61 atvkskeeaq hvqrvlaqll n:eaaltarm skfwiglqre kgkcldpsip Ikgfswvggg 

121 edtpysnwhk elrnsciskr cvsllldlsq pllpsrlpkw segpcgspgs pgsniegfvc
181 kfsfkgmcrp lalggpgqvt yttpfqttss sleavpfasa anvacgegdk detqshyflc
241 kekapdvfdw gssgplcvsp kygcnfnngg cliqdcfeggd gsflcgcrpg frilddlvtc 
301 asrnpcsssp crggatcvlg phgknytcrc pqgyqidssq Idcvdvdecq dspcaqecvn 
361 tpggfrcecw vgyepggpge gacqdvdeca Igrspcaqgc tntdgsfhcs ceegyvlage 

421 dgtqcqdvde cvgpggplcd sicfntqgsf hcgclpgwvl apngvsctmg pvslgppsgp
481 pdeedkgeke gstvpraata sptrgpegtp katpttsrps lssdapitsa pikmiapsgs
541 pgvwrepsili hataasgpqe paggdssvat qnndgtdgqk lllfyilgtv vaillllala
601 lgiivyrkrr akreekkekk pqnaadsysw vperaesram enqysptpgt dc

SEQ ID NO: 43

BAC05523 collectin placenta 1 [Mus musculus] 

gi|21901969|dbj|BAC05523.1|[21901969]
FEATURES Location/Qualifiers
source 1..742 /organism="Mus musculus" /db_xref="taxon:10090" /tissue_lib="Liver"
Protein 1..742 /product="collectin placenta 1"
CDS 1..742 /gene="CL-P1" /coded_by="AB078434.1:92..2320"

ORIGIN 1 mkddfaeeee vqsfgykrfg iqegtqctkc knnwalkfsi vllyilcall titvailgyk 
61 wekmdnvtd gmetshqtyd nkltavesdl kklgdqagkk alstnselst frsdildlrq

121 qlqeitekts knkdtleklq angdslvdrq sqiketlqnn sflittvnkt Iqayngyvtn 
181 Iqqdtsvlqg nlqsqnnysqs wimnlnnln Itqvqqrnli sniqqsvddt slaiqriknd 
241 fqniqqvflq akkdtdwike kvqslqtlaa nnsalakann dtledmnsql ssftgqmdni 
301 ttisqaneqs Ikdiqdihkd tenrtavkfs qleerfqvfe tdivniisni sytahhirti 
361 tsnlndvrtt ctdtltrhtd dltslnntlv nirldsisir mqqdmmrski dtevanlsw
421 meemklvdsk hgqliknfti lqgppgprgp kgdrgsqgpp gptgnkgqkg ekgepgppgp
481 agergtigpv gppgergskg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkdglpgp
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgame plalqneptp
601 asevngcpph wknftdkcyy fslekeifed akifcedkss hivfinsree qqwikkhtvg
661 reshwigltd seqesewkwl dgspvdyknw kagqpdnwgs ghgpgedcag liyagqwndf
721 qcdeinnfic ekereavpss il



SEQ ID NO: 45 

AAM34742 46-kDa collectin precursor [Bos taurus] 
gi|21105685|gb|AAM34742.1|AF509589_1[21105685]

sig_peptide 1..20
Region 67..245 /region_name="collagen-like region"
Region 245..371 /region_name="carbohydrate recognition domain"
CDS 1..371 /gene="CL-46" /coded_by="join(AF509589.1:1454..1652,
AF509589.1:5950..6066, AF509589.1:6402..6509,
AF509589.1:6823..6930, AF509589.1:7289..7405,
AF509589.1:8021..8104, AF509589.1:10318..10700)"

ORIGIN 1 mlllplsvll lltqpwrslg aemkiysqkt langctlvvc rppegglpgr dgqdgregpq
61 gekgdpgspg pagragrpgp agpigpkgdn gsagepgpkg dtgppgppgm pgpagregps
121 gkqgsmgppg tpgpkgdtgp kggmgapgmq gspgpaglkg ergapgelga pgsagvagpa
181 gaigpqgpsg argppglkgd rgdpgergak gesgladvna lkqrvtileg qlqriqnafs
241 rykkavlfpd gqavgkkifk tagavksysd aqqicreakg qiasprsaae neavaqlvra
301 knndaflsmn distegkfty ptgeslvysn wasgepnnnn agqpencvqi yregkwndvp
361 csepllvicef

SEQ ID NO: 47

XP_139613 similar to collectin sub-family member 10; collectin liver 1; collectin 34
[Mus musculus]
gi|20903807|ref|XP_139613.1|[20903807]

FEATURES Location/Qualifiers
source 1..420 /organism="Mus musculus" /strain="C57BL/6J" /db_xref="taxon:10090" /chromosome="15"
Protein 1..420 /product="similar to collectin sub-family member 10; collectin liver 1; collectin 34"
Region 152..269 /region_name="C-type lectin (CTL) or carbohydrate-recognition domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 165..269 /region_name="Lectin C-type domain" /note="lectin_c" /db_xref="CDD:pfam00059"
Region 362..419 /region_name="Ubiquitin-conjugating enzyme E2, catalytic domain homologues" /note="UBCc" /db_xref="CDD:smart00212"
Region 363..419 /region_name="Ubiquitin-conjugating enzyme" /note="UQ_con" /db_xref="CDD:pfam00179"
CDS 1..420 /gene="LOC239447" /coded_by="XM_139613.1:1..1263" /db_xref="InterimID:239447"
ORIGIN 1 mngfrvlirs nismllllal lhfqslgldv dsrsaaevca tlitispgpkg ddgergdtge
61 egkdgkvgrq gpkgvkgeig dmgaqgnigk sgpigkkgdk gekgllgipg ekgkagticd
121 cgryrkvvgq ldisvarlkt smkfiknvia gireteekfy yivqeeknyr esltkcrirg
181 gmlampkdev vntliadyva ksgffrvfig vndleregqy vftdntplqn ysnwkeeeps
241 dpsghedcve missgrwndt echltmyfvs slqedliedc lreqgllvqv tpanqellfg
301 idtflgpmsc vyqrtgtkqk lysqcrlwdg lakkqtneta niatfckgae pnrgsrpcgq
361 kqemmtlmms gnkgittfpe sdnlfkwvgt migaagtide dikyklslns pwtliihpq

SEQ ID NO: 48

XP_123211 similar to collectin sub-family member 12 [Mus musculus]
gi|20876566|ref|XP_123211.1|[20876566]

FEATURES Location/Qualifiers
source 1..742 /organism="Mus musculus" /strain="C57BL/6J" /db_xref="taxon:10090" /chromosome="18"
Protein 1..742 /product="similar to collectin sub-family member 12"
Region 79..320 /region_name="V-type ATPase 116kDa subunit family" /note="V_ATPase_sub_a" /db_xref="CDD:pfam01496"
Region 92..337 /region_name="Intermediate filament protein" /note="filament" /db_xref="CDD:pfam00038"
Region 607..731 /region_name="C-type lectin (CTL) or carbohydrate-recognition domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 624..732 /region_name="Lectin C-type domain" /note="lectin_c" /db_xref="CDD:pfam00059"
CDS 1..742 /gene="LOC225157" /coded_by="XM_123211.1:77..2305" /db_xref="InterimID:225157"

ORIGIN 1 mkddfaeeee vqsfgykrfg iqegtqctkc knnwalkfsi vllyilcall titvailgyk
61 vvekmdnvtd gmetshqtyd nkltavesdl kklgdqagkk alsinselst frsdildlrq
121 qlqeitekts khkdtleklq angdslvdrq sqiketlqnn sflittvnkt lqayngyvtn
181 lqqdtsvlqg nlqsqmysqs wimnlnnin ltqvqqrnii sniqqsvddt slaiqriknd
241 fqniqqvflq akkdtdwtke kvqslqtlaa nnsalakann dtledmnsql ssftgqmdni
301 ttisqaneqs lkdiqdihkd tenrtavkfs qleerfqvfe tdivniisni sytahhirti
361 tsnlndvrtt ctdtltrhtd ditslnntiv nirldsisir mqqdmmrski dtevanlsw
421 meemklvdsk hgqiiknfti lqgppgprgp kgdrgsqgpp gptgnkgqkg ekgepgppgp
481 agergtigpv gppgergskg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkdglpgp
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgame plalqneptp
601 asevngcpph wknftdkcyy fslekeifed akifcedkss hivfinsree qqwikkhtvg
661 reshwigltd seqesewkwl dgspvdyknw kagqpdnwgs ghgpgedcag liyagqwndf
721 qcdeinnfic ekereavpss il

SEQ ID NO: 49

NP_571645 mannose binding-like lectin [Danio rerio]
gi|18858997|ref|NP_571645.1|[18858997]
sig_peptide 1..23
mat_peptide 24..251 /product="mannose binding-like lectin"
Region 24..36 /region_name="N-terminal segment"
Region 33..70 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen" /db_xref="CDD:pfam01391"
Region 33..70 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen" /db_xref="CDD:pfam01391"
Region 37..101 /region_name="collagen-like structure"
Region 37..70 /region_name="Collagen triple helix repeat (20 copies)" /note="Collagen" /db_xref="CDD:pfam01391"
Region 71..74 /region_name="break in collagen structure"
Region 102..132 /region_name="neck region"
Region 133..251 /region_name="carbohydrate recognition domain" /note="CRD"
Region 134..247 /region_name="C-type lectin (CTL) or carbohydrate-recognition domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 146..247 /region_name="Lectin C-type domain" /note="lectin_c" /db_xref="CDD:pfam00059"
CDS 1..251 /gene="mbl" /coded_by="NM_131570.1:68..823" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose; mannose binding-like lectin" /db_xref="LocusID:58091"
ORIGIN 1 mallklflga llllqlvlql magaadpqsl ncpayagvpg tpghnglpgr dgrvgrdgan

61 gpkgekgepg vnvqgppgka gppgpagakg ergpsglpgq dcmsdslkse Iqklsdkial 
121 iekwnfktf kkvgqkyyvt ddveetfdkg mqycssngga Mprtleen allkvfvssa 
181 fkrifiritd rekegefvdt drkkltftnw gpnqpdnykg aqdcgaiads glwddvscds 
241 lypiiceiei k 

SEQ ID NO: 50

NP_569057 collectin sub-family member 12, isoform I; scavenger receptor with C-
type lectin; collectin placenta 1 [Homo sapiens]
gi|18641360|ref|NP_569057.1|[18641360]

FEATURES Location/Qualifiers
source 1..742 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="18" /map="18pter-p11.3"
Protein 1..742 /product="collectin sub-family member 12, isoform I" /note="isoform I
is encoded by transcript variant I; scavenger receptor with C-type lectin; collectin
placenta 1"
Region 79..328 /region_name="V-type ATPase 116kDa subunit family" /note="V_ATPase_sub_a" /db_xref="CDD:pfam01496"
Region 443..589 /region_name="collagen-like domain"
Region 607..731 /region_name="C-type lectin (CTL) or carbohydrate-recognition domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 624..732 /region_name="Lectin C-type domain" /note="lectin_c" /db_xref="CDD:pfam00059"
Region 668..719 /region_name="Beta-lactamase" /note="beta-lactamase" /db_xref="CDD:pfam00144"
CDS 1..742 /gene="COLEC12" /coded_by="NM_130386.1:172..2400" /db_xref="LocusID:81035"

ORIGIN 1 mkddfaeeee vqsfgykrfg iqegtqctkc knnwalkfsi illyilcall titvailgyk
61 wekmdnvtg gmetsrqtyd dkltavesdl kklgdqtgkk aistnselst frsdildlrq
121 qlreitekts knkdtleklq asgdalvdrq sqiketlenn sflittvnkt lqayngyvtn
181 lqqdtsvlqg nlqnqmyshn wimnlnnin ltqvqqrnii tnlqrsvddt sqaiqriknd
241 fqniqqvflq akkdtdwike kvqslqtlaa nnsalakann dtledmnsql nsftgqmenr
301 ttisqaneqn lkdiqdihkd aenrtaikfn qleerfqife tdivniisni sytahhirti
361 tsnlnevrtt ctdtltkhtd dltslnntla nirldsvsir mqqdimrsri dtevanlsvi
421 meemklvdsk hgqiiknfti lqgppgprgp rgdrgsqgpp gptgnkgqkg ekgepgppgp
481 agergpigpa gppgerggkg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkeglpgp
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgaw plalqneptp
601 apedngcpph wknftdkcyy fsvekeifed akifcedkss hivfintree qqwikkqmvg
661 reshwigltd serenewkwl dgtspdyknw kagqpdnwgh ghgpgedcag liyagqwndf
721 qcedvnnfic ekdretviss at

SEQ ID NO: 51

NP_110408 collectin sub-family member 12, isoform II; scavenger receptor with C-
type lectin; collectin placenta 1 [Homo sapiens]
gi|18641358|ref|NP_110408.2|[18641358]

FEATURES Location/Qualifiers
source 1..622 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="18" /map="18pter-p11.3"
Protein 1..622 /product="collectin sub-family member 12, isoform II" /note="isoform
II is encoded by transcript variant II; scavenger receptor with C-type lectin; collectin
placenta 1"
Region 79..328 /region_name="V-type ATPase 116kDa subunit family" /note="V_ATPase_sub_a" /db_xref="CDD:pfam01496"
Region 443..589 /region_name="collagen-like domain"
CDS 1..622 /gene="COLEC12" /coded_by="NM_030781.2:172..2040" /db_xref="LocusID:81035"

ORIGIN 1 mkddfaeeee vqsfgykrfg iqegtqctkc knnwalkfsi illyilcall titvailgyk
61 wekmdnvtg gmetsrqtyd dkltavesdl kklgdqtgkk aistnselst frsdildlrq
121 qlreitekts knkdtleklq asgdalvdrq sqiketlenn sflittvnkt lqayngyvtn
181 lqqdtsvlqg nlqnqmyshn wimnlnnin ltqvqqrnii tnlqrsvddt sqaiqriknd
241 fqnlqqvflq akkdtdwike kvqslqtlaa nnsalakann dtledmnsql nsftgqmeni
301 ttisqaneqn lkdiqdihkd aenrtaikfn qleerfqife tdivniisni sytahhirti
361 tsnlnevrtt ctdtltkhtd dltslnntla nirldsvsir mqqdimrsri dtevanlsvi
421 meemklvdsk hgqiiknfti lqgppgprgp rgdrgsqgpp gptgnkgqkg ekgepgppgp
481 agergpigpa gppgerggkg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkeglpgp
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgaw plalqneptp
601 apednskskp slqpggqgsa ca


SEQ ID NO: 52

NP_569716 collectin sub-family member 12 [Mus musculus]
gi|18485494|ref|NP_569716.1|[18485494]

FEATURES Location/Qualifiers
source 1..742 /organism="Mus musculus" /db_xref="taxon:10090"
Protein 1..742 /product="collectin sub-family member 12"
Region 79..320 /region_name="V-type ATPase 116kDa subunit family" /note="V_ATPase_sub_a" /db_xref="CDD:pfam01496"
Region 607..731 /region_name="C-type lectin (CTL) or carbohydrate-recognition domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 629..732 /region_name="Lectin C-type domain" /note="lectin_c" /db_xref="CDD:pfam00059"
CDS 1..742 /gene="Colec12" /coded_by="NM_130449.1:77..2305" /db_xref="LocusID:140792" /db_xref="MGD:2152907"
ORIGIN 1 mkddfaeeee vqsfgykrfg ihegtqctkc innwalkfsi vllyilcall titvailgyk
61 vvekmdnvsd gmetshqtyd nkltavesdl kklgdqagkk alstnselst frsdildlrq
121 qlqeitekts knkdtieklq angdsivdrq sqiketlqnn sflittvnkt lqayngyvtn
181 lqqdtnvlqg nlqsqmysqs wimnlnnin ltqvqqrnii sniqqsvddt slaiqriknd
241 fqnlqqvflq akkdtdwike kvqslqtlaa nnsalakann dtledmnsql ssftgqmdni
301 ttisqaneqs lkdiqdihkd tenrtavkfs qleerfqvfe tdivniisni sytahhirti
361 tsnlndvrtt ctdtltrhtd ditslnntiv niridsislr mqqdmmrski dtevanlsvv
421 meemklvdsk hgqiiknfti lqgppgprgp kgdrgsqgpp gptgnkgqkg ekgepgppgp
481 agergtigpv gppgergskg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkdglpgp
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgame plalqneptp
601 asevngcpph wknftdkcyy fslekeiled akifcedkss hlvfinsree qqwikkhtvg
661 reshwigltd seqesewkwl dgspvdyknw kagqpdnwgs ghgpgedcag liyagqwndf
721 qcdeinhfic ekereavpss il
SEQ ID NO: 53

AAL61856 43kDa collectin precursor [Bos taurus]
gi|18252111|gb|AAL61856.1|[18252111]

FEATURES Location/Qualifiers
source 1..321 /organism="Bos taurus" /db_xref="taxon:9913"
Protein 1..321 /product="43kDa collectin precursor" /name="CL-43; conglutinin; SP-D"
Region 1..166 /region_name="collagen-like"
Region 167..193 /region_name="alpha-helical neck"
Region 195..321 /region_name="carbohydrate-recognition domain"
CDS 1..321 /gene="CL43" /coded_by="join(AY071822.1:2945..3143,
AY071822.1:5843..5950, AY071822.1:6273..6344,
AY071822.1:6734..6850, AY071822.1:7039..7122, AY071822.1:9525..9910)"
ORIGIN 1 mlplplsill lltqsqsfig eemdvysekt ltdpctlwc appadslrgh dgrdgkegpq 
5 61 gekgdpgppg mpgpagregp sgrqgsmgpp gtpgpkgepg peggvgapgm 

pgspgpaglk 

121 gergtpgpgg aigpqgpsga mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege
181 vqrlqmVtq yrkavlfpdg qavgeklfkt agavksysda eqicreakgq lasprssaen 
241 eavtqivrak nkhaylsmnd iskegkftyp tggsldysnw apgepnnrak degpenclei
301 ysdgnwndie creerlvice f

SEQ ID NO: 44 

AAL61855 43kDa collectin precursor [Bos taurus] 
gi|18252109|gb|AAL61855.1|[18252109]

FEATURES Location/Qualifiers source 1..321 /organism="Bos taurus"
/db_xref="taxon:9913" /tissue_type="liver"
Protein 1..321 /product="43kDa collectin precursor" /name="CL-43; conglutinin; SP-D"
CDS 1..321 /gene="CL43" /coded_by="AY071821.1:172..1137"
ORIGIN 1 mlplplsill lltqsqsfig eemdvysekt ltdpctlwc appadslrgh dgrdgkegpq
61 gekgdpgppg mpgpagregp sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk

121 gergtpgpgg aigpqgpsga mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege 
181 vqriqnivtq yrkavlfpdg qavgeklfkt agavksysda eqicreakgq lasprssaen 
241 eavtqivrak nkhaylsmnd iskegkftyp tggsldysnw apgepnnrak degpenclei

301 ysdgnwndie creerlvice f 

SEQ ID NO: 46 

BAB22581 data source:SPTR, source key:Q9Y6Z7, evidence:ISS-homolog to
COLLECTIN 34-putative [Mus musculus] gi|12833584|dbj|BAB22581.1|[12833584]

FEATURES Location/Qualifiers source 1..272 /organism="Mus musculus"
/strain="C57BL/6J" /db_xref="FANTOM_DB:1010001H16"
/db_xref="MGD:1904296" /db_xref="taxon:10090" /clone="1010001H16"
/sex="male" /tissue_type="heart" /clone_lib="RIKEN full-length enriched mouse
cDNA library" /dev_stage="adult"

Protein 1..272 /name="data source:SPTR, source key:Q9Y6Z7, evidence:ISS homolog
to COLLECTIN 34 putative"
CDS 1..272 /coded_by="AK003121.1:81..899"
/db_xref="MGD:1918943"
ORIGIN 1 mmmrdlalag mlislaflsl lpsgcpqqtt edacsvqilv pglkgdagek gdkgapgrpg

61 rvgptgekgd mgdkgqkgtv grhgkigpig akgekgdsgd igppgpsgep gipcecsqir 
121 kaigemdnqv tqlttelkfi knavagvret eskiyllvke ekryadaqis cqarggtism
181 pkdeaanglm asylaqagia rvfigindle kegafvysdr spmqtfnkwr sgepnnayde 
241 edcvemvasg gwndvachit myfmcefdke nl 


SEQ ID NO: 54 

NP_034905 mannose binding lectin, liver (A) [Mus musculus] 
gi|6754654|ref|NP_034905.1|[6754654]

FEATURES Location/Qualifiers source 1..239 /organism="Mus musculus"
/db_xref="taxon:10090" /chromosome="14" /map="14 15.0 cM"
Protein 1..239 /product="mannose binding lectin, liver (A)"

misc_feature 19..239 /partial /note="mature protein based on homology to rat MPB-A"
Region 126..236 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 135..237 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"
CDS 1..239 /gene="Mbl1" /coded_by="NM_010775.1:121..840"
/db_xref="LocusID:17194" /db_xref="MGD:96923"

ORIGIN 1 mlllpllpvl lcwsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg

61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeirilkskl qltnklinafs
121 mgkksgkklf vtntiekmpfs kvkslctelq gtvaiprnae enkaiqeyat giaflgitde
181 ategqfmyvt ggritysnwk kdepnnhgsg edcviildng lwndiscqas fkavcefpa

SEQ ID NO: 55 

NP_034906 mannose binding lectin, serum (C) [Mus musculus]
gi|6754656|ref|NP_034906.1|[6754656]

sig_peptide 1..18

Region 120..241 /region_name="C-type lectin (CTL) or carbohydrate-recognition 

domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034" 

Region 140..242 /region_name="Lectin C-type domain" /note="lectin_c" 

/db_xref="CDD:pfam00059" 

CDS 1..244 /gene="Mbl2" /coded_by="NM_010776.1:177..911"

/note="polysaccharide-binding component of RaRF; sequence similarity to
mannose-binding proteins" /db_xref="LocusID:17195" /db_xref="MGD:96924"
ORIGIN 1 msiftsflll c\A4vvyaet ltegvqnscp wtcsspgin gfpgkdgrdg akgekgepgq

61 girglqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseidseiaa Irselralm 

121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl

181 gitdvrvegs fedltgnrvr ytnwndgepn ntgdgedcw ilgngkwndv pcsdsflaic 
241 efsd 

SEQ ID NO: 56 

NP_006429 collectin sub-family member 10; collectin liver 1; collectin 34 [Homo
sapiens] gi|5453619|ref|NP_006429.1|[5453619] 

FEATURES Location/Qualifiers source 1..277 /organism="Homo sapiens" 

/db_xref="taxon:9606" /chromosome="8" /map="8q23-q24.1" 

Protein 1..277 /product="collectin sub-family member 10" /note="collectin liver 1;
collectin 34"

Region 152..271 /region_name="C-type lectin (CTL) or carbohydrate-recognition 

domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034" 

Region 165..272 /region_name="Lectin C-type domain" /note="lectin_c"

/db_xref="CDD:pfam00059" 

CDS 1..277 /gene="COLEC10" /coded_by="NM_006438.2:76..909"

/db_xref="LocusID:10584"

ORIGIN 1 mngfaslln- nqfillvlfl Iqiqslgldi dsrptaevca thtispgpkg ddgekgdpge 

61 egkhgkvgrm gpkgikgeig dmgdrgnigk tgpigkkgdk gekgligipg ekgkagtvcd 
121 cgryrkfvgq Idisiarlkt smkfvknvia gireteekfy yivqeeknyr eslthcrirg 
181 gmlampkdea antliadyva ksgffrvfig vndleregqy mftdntplqn ysnwnegeps

241 dpyghedcve missgrwndt echltmyfvc efikkkk 

SEQ ID NO: 57 

BAB72147 collectin placenta 1 [Homo sapiens] 
gi|17026101|dbj|BAB72147.1|[17026101]

FEATURES Location/Qualifiers source 1..742 /organism="Homo sapiens"
/db_xref="taxon:9606" /sex="female" /tissue_lib="placenta"
Protein 1..742 /product="collectin placenta 1" 
CDS 1..742 /gene="CL-P1" /coded_by="AB005145.1:71..2299"

ORIGIN 1 mkddfaeeee vqsfgykrfg iqegtqctkc knnwalkfsi illyilcall titvailgyk
61 vvekmdnvtg gmetsrqtyd dkltavesdl kklgdqtgkk aistnselst frsdildlrq 
121 qlreitekts knkdtleklq asgdalvdrq sqiketlenn sflittvnkt Iqayngyvtn 
181 Iqqdtsvlqg nlqnqmyshn wimnlnnin Itqvqqmli tnlqrsvddt sqaiqriknd 
241 fqniqqvflq akkdtdwike kvqslqtlaa nnsalakann dtledmnsql nsftgqmeni 
301 ttisqaneqn Ikdiqdihkd aenrtaikfn qleerfqife tdivniisni sytahhirti 
361 tsnlnevrtt ctdtltkhtd ditslnntia nirldsvsir mqqdlmrsrl dtevanlsvl 
421 meemklvdsk hgqiiknfti Iqgppgprgp rgdrgsqgpp gptgnkgqkg ekgepgppgp 
481 agergpigpa gppgerggkg skgsqgpkgs rgspgkpgpq gpsgdpgppg ppgkeglpgp 
541 qgppgfqglq gtvgepgvpg prglpglpgv pgmpgpkgpp gppgpsgavv plalqneptp 
601 apedngcpph wknftdkcyy fsvekeifed akifcedkss hIvRntree qqwikkqmvg 
661 reshwigltd serenewkwl dgtspdyknw kagqpdnwgh ghgpgedcag liyagqwndf 
721 qcedvnnfic ekdretviss al 

SEQ ID NO: 58 

AAF63470 mannose binding-like lectin precursor [Carassius auratus]
gi|7542474|gb|AAF63470.1|AF227739_1[7542474]
sig_peptide <1..13

Region 14..25 /region_name="N-terminal segment"

Region 26..93 /region_name="collagen-like structure"

Region 60..63 /region_name="break in collagen structure"

Region 94..124 /region_name="neck region"

Region 125..246 /region_name="carbohydrate recognition domain" /note="CRD"

CDS 1..246 /gene="MBL" /coded_by="AF227739.1:<1..742" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose"

ORIGIN 1 llllqfalql Idgaepqnin cpayggvpgt pghnglpgrd grdgkdgaig pkgekgesgv 

61 svqgppgkag ppgtagekge rgpsgpqgsp gsesvleslk seiqqikaki atfekvssvc 
121 hfrkvgqkyy itdgwgnfd qglkscmefg gtmvsprtsa enqallkivv ssglgskkpy 
181 igvtdrkteg qfvdtegkql tftnwgpgqp ddykglqdcg viedtglwdd ggcgdirpim 
241 ceidik 

SEQ ID NO: 59

AAF63469 mannose binding-like lectin precursor [Danio rerio] 
gi|7542472|gb|AAF63469.1|AF227738_1[7542472]
sig_peptide 1..23

mat_peptide 24..251 /product="mannose binding-like lectin"
Region 24..36 /region_name="N-terminal segment"
Region 37..101 /region_name="collagen-like structure"
Region 71..74 /region_name="break in collagen structure"
Region 102..132 /region_name="neck region"

Region 133..251 /region_name="carbohydrate recognition domain" /note="CRD"
CDS 1..251 /gene="mbl" /coded_by="AF227738.1:68..823" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose"
ORIGIN 1 mallklflga llllqlvlql magaadpqsl ncpayagvpg tpghnglpgr dgrvgrdgan

61 gpkgekgepg vnvqgppgka gppgpagakg ergpsglpgq dcmsdslkse Iqklsdkial 
121 iekwnfktf kkvgqkyyvt ddveetfdkg mqycssngga Ivlprtleen allkvfvssa 
181 fkrifiritd rekegefvdt drkkltftnw gpnqpdnykg aqdcgaiads glwddvscds 
241 lypiiceiei k 
SEQ ID NO: 60

AAF63468 mannose binding-like lectin precursor [Cyprinus carpio]
gi|7542470|gb|AAF63468.1|AF227737_1[7542470]

sig_peptide 1..23

mat_peptide 24..256 /product="mannose binding-like lectin"
Region 24..35 /region_name="N-terminal segment"
Region 36..103 /region_name="collagen-like structure"
Region 70..73 /region_name="break in collagen structure"
Region 104..134 /region_name="neck region"

Region 135..256 /region_name="carbohydrate recognition domain" /note="CRD"
CDS 1..256 /gene="MBL" /coded_by="AF227737.1:67..837" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose"

ORIGIN 1 malfklflgt lillqfalql Idgaepqnin cpayggvpgt pghngipgrd grdgkdgaig 

61 pkgekgesgv svqgppgkag ppgpagekge rgptgsqgsp gsesvleslk seiqqikaki 
121 atfekvasvg hfrqvgqkyy itdgwgtfd qglkfckdfg gtmvfprtsa enqallklw 
181 ssglsskkpy igvtdreteg rfvntegkql tftnwgpgqp ddykglqdcg viedsglwdd 
241 gscgdirpim ceidnk

SEQ ID NO: 61 

AAK97540 surfactant protein A precursor [Gallus gallus] 
gi|15420996|gb|AAK97540.1|AF411083_1[15420996]

sig_peptide 1..18

Region 19..34 /region_name="N-terminal segment"
Region 35..43 /region_name="putative collagen structure"
Region 44..76 /region_name="putative coil structure"
Region 77..97 /region_name="alpha-helical coil-coil structure; neck region"

Region 98..222 /region_name="carbohydrate recognition domain"
Site 121..123 /site_type="glycosylation"
Site 181..183 /site_type="glycosylation" /note="conserved"
CDS 1..222 /gene="SP-A" /coded_by="AF411083.1:61..729"
ORIGIN 1 misysfcmia aavalltpch aqncagapel psipgvsgll gigalkryfg sllwpygeek 

61 lpecqwiqrq qdlstssdde lgnvllnlrq rilqiegvia ldgkitkvge kifasngkev

121 nfssalesce etggtlatpm neeenkaimg ivkqynryay Igikesdtag qfkyvnnqpl 
181 nytswqqyep ngkgtekcve mytdgnwkdr kcnlyrltvc ey 

SEQ ID NO: 62 

JN0450 conglutinin precursor - bovine gi|346501|pir||JN0450[346501]

FEATURES Location/Qualifiers source 1..371 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..371 /product="conglutinin precursor" /note="C3b-binding protein"
Region 1..20 /region_name="domain" /note="signal sequence"
Region 21..371 /region_name="product" /note="conglutinin"
Region 46..214 /region_name="region" /note="collagen-like"
Site 63 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 63 /site_type="modified" /note="5-hydroxylysine (Lys)"
Region 75..371 /region_name="product" /note="conglutinin-N"
Site 78 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 87 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 87 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 96 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 99 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 99 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 108 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 111 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 129 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 132 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 135 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 135 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 141 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 141 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 147 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 153 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 159 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 159 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 162 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 162 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 171 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 195 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 198 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 198 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 210 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 210 /site_type="modified" /note="5-hydroxylysine (Lys)"
Region 248..369 /region_name="domain" /note="C-type lectin homology #label LCH"

Site 337 /site_type="binding" /note="carbohydrate (Asn) (covalent)" 

ORIGIN 1 mlllplsvll lltqpwrslg aemttfsqki lanactlvmc splesglpgh dgqdgrecph 
61 gekgdpgspg pagragrpgw vgpigpkgdn gfvgepgpkg dtgprgppgm pgpagregps 

30 121 gkqgsmgppg tpgpkgetgp kggvgapglq gfpgpsglkg ekgapgetga pgragvtgps 
181 gaigpqgpsg argppglkgd rgdpgetgak gesglaevna Ikqrvtildg hlrrfqnafs 
241 qykkavlfpd gqavgekifk tagavksysd aeqicreakg qiasprssae neavtqmvra 
301 qeknayismn distegrfty ptgeilvysn wadgepnnsd egqpencvei fpdgkwndvp 
361 cskqilvice f 


SEQ ID NO: 63 

A57250 mannan-binding protein - chicken (fragment) 
gi|1362725|pir||A57250[1362725]

FEATURES Location/Qualifiers source 1..30 /organism="Gallus gallus"

/db_xref="taxon:9031"

Protein 1..30 /product="mannan-binding protein" /note="collectin"
Site 28 /site_type="modified" /note="4-hydroxyproline (Pro)" 
ORIGIN 1 lltcdkpeek myscpiiqcs apavnglpgd 


SEQ ID NO: 64 

A53570 collectin-43 - bovine gi|1083017|pir||A53570[1083017]

FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..301 /product="collectin-43" /note="lectin CL-43"

Region 177..299 /region_name="domain" /note="C-type lectin homology #label 
LCH" 
ORIGIN 1 eemdvysekt ltdpctlwc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp
61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapgpgg aigpqgpsga

121 mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege vqriqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqlcreakgq lasprssaen eavtqivrak nkhaylsmnd

241 iskegkftyp tggsldysnw apgepnnrak degpenclei ysdgnwndie creerlvlce
301 f

SEQ ID NO: 65

AAF28384 lung surfactant protein A [Sus scrofa]

gi|6782434|gb|AAF28384.1|AF133668_1[6782434]

FEATURES Location/Qualifiers source 1..116 /organism="Sus scrofa"
/db_xref="taxon:9823"

Protein <1..116 /product="lung surfactant protein A" /function="involved in the innate
immune system and lipid homeostasis within the lung" /name="collectin; SPA; SP-A"
CDS 1..116 /gene="SFTPA" /coded_by="AF133668.1:<1..353"
ORIGIN 1 avgekvfstn gqsvafdvir elcaraggri aaprspeene aiasivkkhn tyaylglveg 
61 ptagdffyld gtpvnytnwy pgeprgrgke kcvemytdgq wndmcqqyr laicef 


SEQ ID NO: 66 

AAF22145 lung surfactant protein D precursor; SPD; SP-D; CP4 [Sus scrofa] 
gi|6760482|gb|AAF22145.2|AF132496_1[6760482]

sig_peptide 1..20

mat_peptide 21..378 /product="lung surfactant protein D"
CDS 1..378 /gene="SFTPD" /coded_by="AF132496.2:44..1180"
ORIGIN 1 mlllplsvli lltqpprslg aemktysqra vanacalvmc spmenglpgr dgrdgregpr
61 gekgdpglpg avgragmpgl agpvgpkgdn gstgepgakg digpcgppgp pgipgpagke

121 gpsgqqgnig ppgtpgpkge tgpkgevgal gmqgstgarg paglkgerga pgergapgsa

181 gaagpagatg pqgpsgargp pglkgdrgpp gergakgesg lpgitalrqq vetlqgqvqr
241 lqkafsqykk velfpngrgv gekifktggf ektfqdaqqv ctqaggqmas prseteneal
301 sqlvtaqnka aflsmtdikt egnftyptge pivyanwapg epnnnggssg aencveifpn

361 gkwndkacge lrivicef

SEQ ID NO: 67 

P41317 MANNOSE-BINDING PROTEIN C PRECURSOR (MBP-C) (MANNAN-BINDING PROTEIN)
(RA-REACTIVE FACTOR P28A SUBUNIT) (RARF/P28A)
gi|1346477|sp|P41317|MABC_MOUSE[1346477]

FEATURES Location/Qualifiers source 1..244 /organism="Mus musculus"
/db_xref="taxon:10090"
gene 1..244 /gene="MBL2"

Protein 1..244 /gene="MBL2" /product="MANNOSE-BINDING PROTEIN C PRECURSOR"

Region 1..18 /gene="MBL2" /region_name="Signal" /note="BY SIMILARITY."
Region 3 /gene="MBL2" /region_name="Conflict" /note="I -> L (IN REF. 1)."
Region 15 /gene="MBL2" /region_name="Conflict" /note="V -> A (IN REF. 1)."

Region 19..244 /gene="MBL2" /region_name="Mature chain" /note="MANNOSE-BINDING PROTEIN C."
Bond bond(29) /gene="MBL2" /bond_type="disulfide" /note="INTERCHAIN (BY
SIMILARITY)."

Bond bond(34) /gene="MBL2" /bond_type="disulfide" /note="INTERCHAIN (BY
SIMILARITY)."

Region 38..96 /gene="MBL2" /region_name="Domain" /note="COLLAGEN-LIKE (G-X-Y)."

Site 43 /gene="MBL2" /site_type="hydroxylation" /note="(POTENTIAL)." 

Site 58 /gene="MBL2" /site_type="hydroxylation" /note="(POTENTIAL)." 

Site 69 /gene="MBL2" /site_type="hydroxylation" /note="(POTENTIAL)." 

Site 78 /gene="MBL2" /site_type="hydroxylation" /note="(POTENTIAL)." 

Site 81 /gene="MBL2" /site_type="hydroxylation" /note="(POTENTIAL)." 

Region 149..242 /gene="MBL2" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."

Bond bond(151,240) /gene="MBL2" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(218,232) /gene="MBL2" /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 msiftsflll cwtwyaet Itegvqnscp wtcsspgln gfpgkdgrdg akgekgepgq 
61 girglqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseldseiaa Irselrairn 
121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl 
181 gitdvrvegs fedltgnrvr ytnvwidgepn ntgdgedcw llgngkwndv pcsdsflaic 
241 efsd 

SEQ ID NO: 68

P39039 MANNOSE-BINDING PROTEIN A PRECURSOR (MBP-A) (MANNAN-BINDING PROTEIN)
(RA-REACTIVE FACTOR POLYSACCHARIDE-BINDING COMPONENT P28B POLYPEPTIDE)
(RARF P28B)
gi|729972|sp|P39039|MABA_MOUSE[729972]

FEATURES Location/Qualifiers source 1..239 /organism="Mus musculus"
/db_xref="taxon:10090"
gene 1..239 /gene="MBL1"

Protein 1..239 /gene="MBL1" /product="MANNOSE-BINDING PROTEIN A PRECURSOR"

Region 1..17 /gene="MBL1" /region_name="Signal" /note="BY SIMILARITY."
Region 18..239 /gene="MBL1" /region_name="Mature chain" /note="MANNOSE-BINDING PROTEIN A."
Region 37..89 /gene="MBL1" /region_name="Domain" /note="COLLAGEN-LIKE (G-X-Y)."

Region 144..239 /gene="MBL1" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."

Bond bond(146,235) /gene="MBL1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(213,227) /gene="MBL1" /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 mlilpllpvl Icwsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg 

61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeirilkski qltnklhafs 
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaipmae enkaiqevat giaflgitde
181 ategqfmyvt ggrltysnwk kdepnnhgsg edcviildng Iwndiscqas fkavcefpa 

SEQ ID NO: 69 

P42916 COLLECTIN-43 (CL-43) gi|1168967|sp|P42916|CL43_BOVIN[1168967]
FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913"
Protein 1..301 /product="COLLECTIN-43"
Region 29..142 /region_name="Domain" /note="COLLAGEN-LIKE (G-X-Y)."
Region 202..301 /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."

Bond bond(204,299) /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(277,291) /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 eemdvysekt itdpctlvvc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp 

61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapgpgg aigpqgpsga

121 mgppglkgdr gdpgekgarg etsvlevdtl rqmnrnlege vqrlqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqicreakgq lasprssaen eavtqivrak nkhaylsmnd

241 iskegkftyp tggsldysnw apgepgnrak degpenclei ysdgnwndie creerlvice 
301 f 

SEQ ID NO: 70
CAB56155 DMBT1/8kb.2 protein [Homo sapiens]
gi|5912464|emb|CAB56155.1|[5912464]
sig_peptide 1..26

mat_peptide 26..2412 /product="DMBT1/8kb.2 protein"

CDS 1..2412 /gene="DMBT1" /coded_by="AJ243212.1:107..7345"

/note="Sequence is an alternative splice form of the DMBT1 gene that is expressed
in human adult trachea. Isoforms of DMBT1 are identical to the collectin binding
protein gp-340. Full-length cDNA clone contains 1 bp deletions in codons 100 and
1751, that were corrected by comparison with the genomic exons"
ORIGIN 1 mgistvilem cllwgqvlst ggwiprttdy aslipsevpl dttvaegspf pseltlestv 

61 aegspisles tiettvaegs lipsestles tvaegsdsgl alrlvngdgr cqgrveilyr

121 gswgavcdds wdtndanwc rqigcgwams apgnawfgqg sgpialddvr csghesylws

181 cphngwishn cghgedagvi csaaqpqsti rpeswpvris ppvptegses slalrlvngg 
241 drcrgrvevl yrgswgtvcd dywdtndanv vcrqigcgwa msapgnaqfg qgsgpividd 
301 vrcsghesyl wscphngwit hncghsedag vicsapqsrp tpspdtwpts hastagpess

361 lalrlvnggd rcqgn/evly rgswgtvcdd swdtsdanvv crqigcgwat sapgnarfgq 
421 gsgpivlddv rcsgyesylw scphngwish ncqhsedagv icsaahswst pspdtlptit 
481 Ipastvgses slalrlvngg drcqgrvevi yrgswgtvcd dswdtndanv vcrqigcgwa 
541 mlapgnarfg qgsgpividd vrcsgnesyl wscphngwis hncghsedag vicsgpessi 
601 alrlvnggdr cqgnyevlyr gswgtvcdds wdtndanwc rqigcgwams apgnarfgqg

661 sgpivlddvr csghesylws cpnngwishn cghhedagvl csaaqsrstp rpdtlstiti 
721 ppstvgsess Itlrlvngsd rcqgrvevly rgswgtvcdd swdtndanw crqigcgwat 
781 sapgnarfgq gsgpivlddv rcsghesylw scphngwish ncghhedagv icsvsqsrpt 
841 pspdtwptsh astagpessi alrlvnggdr cqgrvevlyr gswgtvcdds wdtsdanwc 
901 rqigcgwats apgnarfgqg sgpivlddvr csgyesylws cphngwishn cqhsedagvi

961 csaahswstp spdtlptiti pastvgsess lalrlvnggd rcqgrvevly qgswgtvcdd 
1021 swdtndanw crqigcgwam sapgnarfgq gsgpivldda rcsghesylw scphngwish 
1081 ncghsedagv icsasqsrpt pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr 
1141 gswgtvcddy wdtndanvac rqigcgwams apgnarfgqg sgpivlddvr csghesylws
1201 cphngwishn cghhedagvi csasqsqptp spdtwptsha stagsessia lrlvnggdrc

1261 qgrvevlyrg swgtvcddyw dtndanvvcr qigcgwatsa pgnarfgqgs gpivlddvrc 
1321 sghesylwsc phngwishnc ghhedagvic sasqsqptps pdtwptshas tagsesslal
1381 rlvnggdrcq grvevlyrgs wgtvcddywd tndanvvcrq Igcgwatsap gnarfgqgsg 
1441 pivlddvrcs ghesylwscp hngwlshncg hhedagvics afqsqptpsp dtwptsrast 
1501 agsestlair lvnggdrcrg rvevlyqgsw gtvcddywdt ndanvvcrql gcgwamsapg

1561 naqfgqgsgp ivlddvrcsg hepylwscph ngwishncgh h^dagvicsa aqsqstprpd 
1621 twittnlpal tvgsesslal rivnggdrcr grvevlyrgs wgtvcddswd tndanwcrq 
1681 Igcgwamsap gnarfgqgsg pivlgdvrcs gnesylwscp hkgwithncg hhedagvics 
1741 atqinstttd wwhpttttta rpssncggfl fyasgtfssp sypayypnna kcvweievns
1801 gyrinlgfsn Ikleahhncs fdyveifdgs Insslllgki cndtrqifts synrmtihfr 
1861 sdisfqntgf lawynsfpsd atlrlvnins syglcagrve iyhggtwgav cddswtiqea 
1921 evvcrqigcg ravsalgnay fgsgsgpiti ddvecsgtes tlwqcrnrgw fshncnhrpd 
1981 agvicsgnhl stpapflnit rpnnyscggf lsqpsgdfss pfypgnypnn akcvwdievq

2041 nnyrvtvifr dvqieggcny dyievfdgpy rsspllarvc dgargsftss snfmsirfis 
2101 dhsitrrgfr aeyysspsnd stnllclpnh mqasvsrsyl qslgfsasdl vistwngyye 
2161 crpqitpniv iftipysgcg tfkqadndti dysnlltaav sggiikrrtd IrihvscrmI 
2221 qnhAA/dtmyi andtihvann tiqveevqyg nfdvnisfyt sssflypvts rpyyvdinqd 
2281 lyvqaeilhs davltlfvdt cvaspysndf tsltydlirs gcvrddtygp ysspslriar

2341 frfrafhfln rfpsvylrck mwcraydps srcyrgcvir skrdvgsyqe kvdwigpiq 
2401 Iqtpprreee pr 

SEQ ID NO: 71 

BAA81747 collectin 34 [Homo sapiens] gi|5162875|dbj|BAA81747.1|[5162875]
FEATURES Location/Qualifiers source 1..277 /organism="Homo sapiens"
/db_xref="taxon:9606"
Protein 1..277 /product="collectin 34"
CDS 1..277 /coded_by="AB002631.1:6..839"
ORIGIN 1 mngfaslirr nqfillvlfl lqiqslgldi dsrptaevca thtispgpkg ddgekgdpge

61 egkhgkvgrm gpkgikgeig dmgdrgnigk tgpigkkgdk gekgllgipg ekgkagtvcd 
121 cgryrkfvgq Idisiarlkt smkfvknvia gireteekfy yivqeeknyr eslthcrirg 
181 gmlampkdea antliadyva ksgffrvfig vndleregqy mftdntplqn ysnwnegeps 
241 dpyghedcve missgrwndt echltmyfvc efikkkk 


SEQ ID NO: 72 

AAB94071 mannan-binding lectin; collectin [Gallus gallus] 
gi|2736145|gb|AAB94071.1|[2736145] 

FEATURES Location/Qualifiers source 1..238 /organism="Gallus gallus"
/strain="White Leghorn" /db_xref="taxon:9031" /tissue_type="liver"

Protein 1..>238 /product="mannan-binding lectin" /name="c-type lectin"
/note="mannan-binding protein; MBP; mannose-binding protein; MBL; collectin"
CDS 1..238 /gene="cMBI" /coded_by="AF022226.1:1..>714"
ORIGIN 1 mmatsllttd kpeekmyscp iiqcsapavn gipgrdgrdg pkgekgdpge girglqglpg 
61 kagpqglkge vgpqgekgqk gergivvtdd lhrqitdlea kirvleddls rykkalslkd

121 vvnigkkmfv stgkkynfek gkslcakags vlasprneae ntalkdiidp ssqayigisd 
181 aqtegrfmyl sggpltysnw kpgepnnhkn edcaviedsg kwndldcsns nifiicel 

SEQ ID NO: 73 

AAB36019 mannan-binding protein, MBP=lectin {N-terminal} [chickens, serum,
Peptide Partial, 30 aa] [Gallus gallus] gi|1311692|gb|AAB36019.1|[1311692]
FEATURES Location/Qualifiers source 1..30 /organism="Gallus gallus"
/db_xref="taxon:9031"

Protein 1..30 /partial /product="mannan-binding protein" /name="lectin" /note="MBP"
ORIGIN 1 lltcdkpeek myscpiiqcs apavnglpgd

SEQ ID NO: 74 

AAB27504 conglutinin (N) {N-terminal} [cattle, Peptide Partial, 60 aa] [Bos taurus]
gi|386660|gb|AAB27504.1|[386660]
FEATURES Location/Qualifiers source 1..60 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..60 /partial /product="conglutinin (N)"

ORIGIN 1 aemttfsqki lanactlvmc splesglpgh dgqdgrecph gekgdpgspg pagragrpgw 
SEQ ID NO: 75

CAA53511 collectin-43 [Bos taurus] gi|499385|emb|CAA53511.1|[499385]
FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913" /tissue_type="liver" /clone_lib="lambda gt11"
Protein 1..301 /product="collectin-43"
mat_peptide 1..301 /product="collectin-43"

CDS 1..301 /coded_by="X75912.1:<1..906" /db_xref="SWISS-PROT:P42916"
ORIGIN 1 eemdvyxekt ltdpctlwc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp
61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapgpgg aigpqgpsga

121 mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege vqriqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqicreakgq lasprssaen eavtqivrak nkhaylsmnd
241 iskegkttyp tggsldysnw apgepgnrak degpenclei ysdgnwndie creerlvice
301 f

SEQ ID NO: 76 

AAA82010 mannose-binding protein C [Mus musculus]
gi|773288|gb|AAA82010.1|[773288]
FEATURES Location/Qualifiers source 1..244 /organism="Mus musculus"

/strain="BALB/c" /db_xref="taxon:10090" /clone="Lambda 14 and 52; CosllA"
/clone_lib="NIH/3T3 Swiss mouse embryo cell line and BALB/c pWE15 cosmid
library"

Protein 1..244 /product="mannose-binding protein C"
Site 1..59 /site_type="signal-peptide" /note="signal-peptide and collagen-like region"
mat_peptide <59..>98 /product="mannose-binding protein C" /note="collagen-like
domain"

mat_peptide <98..>121 /product="mannose-binding protein C" /note="linking-peptide
domain"

mat_peptide <121..244 /product="carbohydrate recognition domain"

CDS 1..244 /gene="Mbl2" /coded_by="join(U09013.1:470..644,U09014.1:43..159,
U09015.1:97..165,U09016.1:576..949)"

ORIGIN 1 msiftsflll cwtvvyaet Itegvqnscp \A/lcsspgln gfpgkdgrdg akgekgepgq 
61 girglqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseidseiaa Irselralrn 
35 121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl 

181 gitdvrvegs fedltgnrvr ytnwndgepn ntgdgedcw ilgngkwndv pcsdsflaic
241 efsd 

SEQ ID NO: 77 

AAA82009 mannose-binding protein A [Mus musculus]
gi|773280|gb|AAA82009.1|[773280]

sig_peptide 1..18

mat_peptide 19..239 /product="unnamed"
mat_peptide 19..>52 /product="mannose-binding protein A" /note="collagen-like
region"

mat_peptide <52..>91 /product="mannose-binding protein A" /note="collagen-like
domain"

mat_peptide <91..>116 /product="mannose-binding protein A" /note="linking-peptide
domain"

CDS 1..239 /gene="Mbl1" /coded_by="join(U09007.1:275..428,U09008.1:287..403,
U09009.1:166..240,U09010.1:78..451)"
ORIGIN 1 mlllpllpvl lcvvsvsssg sqtcedtlkt csyiacgrdg rdgpkgekge pgqglrglqg
61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeirilkskl qlthklhafs
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaiprnae enkaiqevat giaflgitde 
181 ategqfmyvfggrltysnwk kdepnnhgsg edcviildng Iwndiscqas fkavcefpa 



Lung surfactant protein 

SEQ ID NO: 78 

P35247 Pulmonary surfactant-associated protein D precursor (SP-D) (PSP-D) 
gi|464486|sp|P35247|PSPD_HUMAN[464486] 

FEATURES Location/Qualifiers source 1..375 /organism="Homo sapiens" 
/db_xref="taxon:9606" 

gene 1..375 /gene="SFTPD" /note="SFTP4; PSPD"

Protein 1..375 /gene="SFTPD" /product="Pulmonary surfactant-associated protein D
precursor"

Region 1..20 /gene="SFTPD" /region_name="Signal" /note="BY SIMILARITY."
Region 21..375 /gene="SFTPD" /region_name="Mature chain"
/note="PULMONARY SURFACTANT-ASSOCIATED PROTEIN D."
Region 31 /gene="SFTPD" /region_name="Conflict" /note="M -> T (IN REF. 2)."
Region 46..222 /gene="SFTPD" /region_name="Domain" /note="COLLAGEN-LIKE."
Region 59 /gene="SFTPD" /region_name="Conflict" /note="P -> F (IN REF. 3)."
Site 78 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 87 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 90 /gene="SFTPD" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Site 96 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 99 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Region 122 /gene="SFTPD" /region_name="Conflict" /note="A -> P (IN REF. 2)."
Site 171 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 177 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Region 180 /gene="SFTPD" /region_name="Conflict" /note="T -> A (IN REF. 2)."
Region 206 /gene="SFTPD" /region_name="Conflict" /note="D -> P (IN REF. 3)."
Region 223..252 /gene="SFTPD" /region_name="Domain" /note="COILED COIL
(POTENTIAL)."

Region 227..253 /gene="SFTPD" /region_name="Helical region" 

Region 254..256 /gene="SFTPD" /region_name="Hydrogen bonded turn"

Region 257..260 /gene="SFTPD" /region_name="Beta-strand region"

Region 261..262 /gene="SFTPD" /region_name="Hydrogen bonded turn"

Region 263..272 /gene="SFTPD" /region_name="Beta-strand region"

Region 274..283 /gene="SFTPD" /region_name="Helical region"

Region 279..375 /gene="SFTPD" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."

Bond bond(281,373) /gene="SFTPD" /bond_type="disulfide"
Region 284..285 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 287..288 /gene="SFTPD" /region_name="Beta-strand region"
Region 294..307 /gene="SFTPD" /region_name="Helical region"
Region 308 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 311..316 /gene="SFTPD" /region_name="Beta-strand region"
Region 321..322 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 325 /gene="SFTPD" /region_name="Beta-strand region"
Region 327..328 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 331 /gene="SFTPD" /region_name="Beta-strand region"
Region 337 /gene="SFTPD" /region_name="Beta-strand region"
Region 339..340 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 345..347 /gene="SFTPD" /region_name="Helical region"
Bond bond(351,365) /gene="SFTPD" /bond_type="disulfide"

Region 351..354 /gene="SFTPD" /region_name="Beta-strand region"
Region 356..357 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 360..363 /gene="SFTPD" /region_name="Beta-strand region"
Region 365..366 /gene="SFTPD" /region_name="Hydrogen bonded turn"
Region 369..375 /gene="SFTPD" /region_name="Beta-strand region"

Region 374 /gene="SFTPD" /region_name="Conflict" /note="E -> EH (IN REF. 3)."
ORIGIN 1 mllfllsalv lltqplgyle aemktyshrt mpsactlvmc ssvesgipgr dgrdgregpr
61 gekgdpglpg aagqagmpgq agpvgpkgdn gsvgepgpkg dtgpsgppgp pgvpgpagre
121 galgkqgnig pqgkpgpkge agpkgevgap gmqgsagarg lagpkgergv pgergvpgnt
181 gaagsagamg pqgspgargp pglkgdkgip gdkgakgesg lpdvaslrqq veaiqgqvqh
241 lqaafsqykk velfpngqsv gekifktagf vkpfteaqll ctqaggqias prsaaenaal
301 qqiwaknea aflsmtdskt egkftyptge sivysnwapg epnddggsed cveiftngkw
361 ndracgekri wcef

SEQ ID NO: 79
NP_002395 microfibrillar-associated protein 4; microfibril-associated glycoprotein 4
[Homo sapiens] gi|23111005|ref|NP_002395.1|[23111005]

FEATURES Location/Qualifiers source 1..255 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="17" /map="17p11.2"
Protein 1..255 /product="microfibrillar-associated protein 4" /note="microfibril-associated glycoprotein 4"

Region 36..255 /region_name="smart00186, FBG, Fibrinogen-related domains
(FReDs); Domain present at the C-termini of fibrinogen beta and gamma chains,
and a variety of fibrinogen-related proteins, including tenascin and Drosophila
scabrous"

Region 38..254 /region_name="pfam00147, fibrinogen_C, Fibrinogen beta and
gamma chains, C-terminal globular domain"

CDS 1..255 /gene="MFAP4" /coded_by="NM_002404.1:26..793"
/db_xref="LocusID:4239" /db_xref="MIM:600596"

ORIGIN 1 mkallalpll lllstppcap qvsgirgdal erfclqqpid cddiyaqgyq sdgvyliyps
61 gpsvpvpvfc dmtteggkwt vfqkrfngsv sffrgwndyk lgfgradgey wiglqnmhil
121 tikqkyelrv diedfennta yakyadfsis pnavsaeedg ytlfvagfed ggagdslsyh
181 sgqkfstfdr dqdlfvqnca alssgafwfr schfahlngf ylggshlsya riginwaqwkg
241 fyyslkrtem kirra

SEQ ID NO: 80 

1KMRA Chain A, Solution Nmr Structure Of Surfactant Protein B (11-25) (Sp-B11-25).
gi|22219056|pdb|1KMR|A[22219056]

FEATURES Location/Qualifiers source 1..15 /organism="Homo sapiens"
/db_xref="taxon:9606"
SecStr 3..11 /sec_str_type="helix" /note="helix 1"
ORIGIN 1 crallkriqa mipkg 

SEQ ID NO: 81
P50404 Pulmonary surfactant-associated protein D precursor (SP-D) (PSP-D)
gi|1769879|sp|P50404|PSPD_MOUSE[1709879]

FEATURES Location/Qualifiers source 1..374 /organism="Mus musculus"
/db_xref="taxon:10090"
gene 1..374 /gene="SFTPD" /note="SFTP4"
Protein 1..374 /gene="SFTPD" /product="Pulmonary surfactant-associated protein D precursor"
Region 1..19 /gene="SFTPD" /region_name="Signal" /note="BY SIMILARITY."
Region 20..374 /gene="SFTPD" /region_name="Mature chain" /note="PULMONARY SURFACTANT-ASSOCIATED PROTEIN D."
Region 45..221 /gene="SFTPD" /region_name="Domain" /note="COLLAGEN-LIKE."
Site 89 /gene="SFTPD" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."
Region 222..253 /gene="SFTPD" /region_name="Domain" /note="COILED COIL (POTENTIAL)."
Region 278..374 /gene="SFTPD" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."
Bond bond(280,372) /gene="SFTPD" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(350,364) /gene="SFTPD" /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 mlpflsmlvl lvqplgnlga emkslsqrsv pntctlvmcs ptenglpgrd grdgregprg
61 ekgdpglpgp mglsglqgpt gpvgpkgeng sagepgpkge rglsgppglp gipgpagkeg
121 psgkqgnigp qgkpgpkgea gpkgevgapg mqgstgakgs tgpkgergap gvqgapgnag
181 aagpagpagp qgapgsrgpp glkgdrgvpg drgikgesgl pdsaalrqqm eaikgklqrl
241 evafshyqka alfpdgrsvg dkifrtadse kpfedaqemc kqaggqiasp rsatenaaiq
301 qlitahnkaa fismtdvgte gkftyptgep lvysnwapge pnnnggaenc veiftngqwn
361 dkacgeqriv icef

SEQ ID NO: 82

P06908 Pulmonary surfactant-associated protein A precursor (SP-A) (PSP-A)
(PSAP) gi|1172693|sp|P06908|PSPA_CANFA[1172693]

FEATURES Location/Qualifiers source 1..248 /organism="Canis familiaris"
/db_xref="taxon:9615"
gene 1..248 /gene="SFTPA1" /note="SFTPA; SFTP1"
Protein 1..248 /gene="SFTPA1" /product="Pulmonary surfactant-associated protein A precursor"
Region 1..17 /gene="SFTPA1" /region_name="Signal"
Region 18..248 /gene="SFTPA1" /region_name="Mature chain" /note="PULMONARY SURFACTANT-ASSOCIATED PROTEIN A."
Site 20 /gene="SFTPA1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (POTENTIAL)."
Region 28..100 /gene="SFTPA1" /region_name="Domain" /note="COLLAGEN-LIKE."
Region 153..248 /gene="SFTPA1" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."
Bond bond(155,246) /gene="SFTPA1" /bond_type="disulfide" /note="BY SIMILARITY."
Site 207 /gene="SFTPA1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (PROBABLE)."
Bond bond(224,238) /gene="SFTPA1" /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 mwlrclalal tilmvsglen ntkdvcvgnp gipgtpgshg lpgrdgrdgv kgdpgppgpl
61 gppggmpghp gpngfntgapg vagergekge pgergppglp asldeelqtt lhdirhqilq
121 tmgvlslhes llwgrkvfs snaqsinfnd iqelcagagg qiaapmspee neavaslvkk
181 yntyaylglv espdsgdfqy mdgapvnytn wypgeprgrg keqcvemytd gqwnnknclq
241 yrlalcef

SEQ ID NO: 83 

P12842 Pulmonary surfactant-associated protein A precursor (SP-A) (PSP-A)
(PSAP) gi|131413|sp|P12842|PSPA_RABIT[131413]

FEATURES Location/Qualifiers source 1..247 /organism="Oryctolagus cuniculus"
/db_xref="taxon:9986"
gene 1..247 /gene="SFTPA1" /note="SFTPA; SFTP1"
Protein 1..247 /gene="SFTPA1" /product="Pulmonary surfactant-associated protein A precursor"
Region 1..15 /gene="SFTPA1" /region_name="Signal" /note="POTENTIAL"
Region 12 /gene="SFTPA1" /region_name="Variant" /note="S -> P."
Region 16..247 /gene="SFTPA1" /region_name="Mature chain" /note="PULMONARY SURFACTANT-ASSOCIATED PROTEIN A."
Region 27..99 /gene="SFTPA1" /region_name="Domain" /note="COLLAGEN-LIKE."
Region 57..60 /gene="SFTPA1" /region_name="Conflict" /note="GPMG -> APWA (IN REF. 2)."
Region 152..247 /gene="SFTPA1" /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."
Bond bond(154,245) /gene="SFTPA1" /bond_type="disulfide" /note="BY SIMILARITY."
Site 206 /gene="SFTPA1" /site_type="glycosylation" /note="N-LINKED (GLCNAC.) (PROBABLE)."
Bond bond(223,237) /gene="SFTPA1" /bond_type="disulfide" /note="BY SIMILARITY."

ORIGIN 1 mlllslaltl isapasdtcd tkdvcigspg ipgtpgshgl pgrdgrdgvk gdpgppgpmg
61 ppggmpglpg rdgligapgv pgergdkgep gergppglpa yideelqati helrhhalqs
121 igvlslqgsm kavgekifst ngqsvnfdai revcaraggr iavprsleen eaiasivker
181 ntyaylglae gptagdfyyl dgdpvnytnw ypgeprgqgr ekcvemytdg kwndknclqy
241 rivicef

SEQ ID NO: 84 

NP_033186 surfactant associated protein D [Mus musculus]
gi|6677921|ref|NP_033186.1|[6677921]

sig_peptide 1..19
mat_peptide 20..374 /product="surfactant associated protein D"
Region 260..373 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 271..374 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"
CDS 1..374 /gene="Sftpd" /coded_by="NM_009160.1:43..1167"
/db_xref="LocusID:20390" /db_xref="MGD:109515"

ORIGIN 1 mlpflsmlvl lvqplgnlga emkslsqrsv pntctlvmcs ptenglpgrd grdgregprg
61 ekgdpglpgp mglsglqgpt gpvgpkgeng sagepgpkge rglsgppglp gipgpagkeg
121 psgkqgnigp qgkpgpkgea gpkgevgapg mqgstgakgs tgpkgergap gvqgapgnag
181 aagpagpagp qgapgsrgpp glkgdrgvpg drgikgesgl pdsaalrqqm eaikgklqrl
241 evafshyqka alfpdgrsvg dkifrtadse kpfedaqemc kqaggqiasp rsatenaaiq
301 qlitahnkaa fismtdvgte gkftyptgep lvysnwapge pnhnggaenc veiftngqwn
361 dkacgeqriy icef

SEQ ID NO: 85

1B08C Chain C, Lung Surfactant Protein D (Sp-D) (Fragment)
gi|6573321|pdb|1B08|C[6573321]

FEATURES Location/Qualifiers source 1..158 /organism="Homo sapiens"
/db_xref="taxon:9606"
SecStr 13..36 /sec_str_type="helix" /note="helix 7"
Region 38..158 /region_name="Domain 3" /note="NCBI Domains"
SecStr 39..44 /sec_str_type="sheet" /note="strand 21"
SecStr 45..51 /sec_str_type="sheet" /note="strand 22"
SecStr 53..56 /sec_str_type="sheet" /note="strand 23"
SecStr 57..67 /sec_str_type="helix" /note="helix 8"
Bond bond(64,156) /bond_type="disulfide"
SecStr 77..90 /sec_str_type="helix" /note="helix 9"
SecStr 93..96 /sec_str_type="sheet" /note="strand 24"
Het join(bond(100),bond(100),bond(100),bond(104),bond(104),
bond(104),bond(127),bond(132),bond(133)) /heterogen="( CA, 8 )"
Het join(bond(104),bond(133),bond(133),bond(133)) /heterogen="( CA, 9 )"
SecStr 107..110 /sec_str_type="sheet" /note="strand 25"
SecStr 112..115 /sec_str_type="sheet" /note="strand 26"
Het join(bond(124),bond(126),bond(132),bond(144),bond(145),
bond(145),bond(145),bond(145),bond(145),bond(145),
bond(145),bond(145),bond(145),bond(145),bond(145),
bond(145),bond(145),bond(145),bond(145),bond(145),
bond(145),bond(145),bond(145),bond(145),bond(145),bond(145))
/heterogen="( CA, 7 )"
SecStr 133..139 /sec_str_type="sheet" /note="strand 27"
Bond bond(134,148) /bond_type="disulfide"
SecStr 141..147 /sec_str_type="sheet" /note="strand 28"
SecStr 150..158 /sec_str_type="sheet" /note="strand 29"

ORIGIN 1 eaeagsvasi rqqvealqgq vqhiqaafsq ykkveffpng qsvgekifkt agfvkpftea
61 qllctqaggq lasprsaaen aalqqivvak neaaflsmtd sktegkftyp tgeslvysnw
121 apgepnddgg sedcveiftn gkwndracge kriwcef

SEQ ID NO: 86 

1B08B Chain B, Lung Surfactant Protein D (Sp-D) (Fragment) 
gi|6573320|pdb|1B08|B[6573320]

FEATURES Location/Qualifiers source 1..158 /organism="Homo sapiens"
/db_xref="taxon:9606"
SecStr 11..34 /sec_str_type="helix" /note="helix 4"
Region 37..158 /region_name="Domain 2" /note="NCBI Domains"
SecStr 39..44 /sec_str_type="sheet" /note="strand 11"
SecStr 45..51 /sec_str_type="sheet" /note="strand 12"
SecStr 53..56 /sec_str_type="sheet" /note="strand 13"
SecStr 57..67 /sec_str_type="helix" /note="helix 5"
Bond bond(64,156) /bond_type="disulfide"
SecStr 77..90 /sec_str_type="helix" /note="helix 6"
SecStr 93..96 /sec_str_type="sheet" /note="strand 14"
SecStr 97..100 /sec_str_type="sheet" /note="strand 15"
Het join(bond(100),bond(100),bond(100),bond(104),bond(104),
bond(104),bond(127),bond(132),bond(133)) /heterogen="( CA, 5 )"
Het join(bond(104),bond(133),bond(133),bond(133)) /heterogen="( CA, 6 )"
SecStr 107..110 /sec_str_type="sheet" /note="strand 16"
Het join(bond(124),bond(126),bond(132),bond(144),bond(145),bond(145))
/heterogen="( CA, 4 )"
SecStr 133..139 /sec_str_type="sheet" /note="strand 17"
Bond bond(134,148) /bond_type="disulfide"
SecStr 141..147 /sec_str_type="sheet" /note="strand 18"
SecStr 150..153 /sec_str_type="sheet" /note="strand 19"
SecStr 154..158 /sec_str_type="sheet" /note="strand 20"

ORIGIN 1 eaeagsvasi rqqvealqgq vqhiqaafsq ykkyelfpng qsvgekifkt agfvkpftea
61 qllctqaggq lasprsaaen aalqqiwak neaaflsmtd sktegkftyp tgeslvysnw
121 apgepnddgg sedcveiftn gkwndracge kriwcef

SEQ ID NO: 87

1B08A Chain A, Lung Surfactant Protein D (Sp-D) (Fragment)
gi|6573319|pdb|1B08|A[6573319]

FEATURES Location/Qualifiers source 1..158 /organism="Homo sapiens"
/db_xref="taxon:9606"
SecStr 10..36 /sec_str_type="helix" /note="helix 1"
Region 38..158 /region_name="Domain 1" /note="NCBI Domains"
SecStr 39..44 /sec_str_type="sheet" /note="strand 1"
SecStr 45..51 /sec_str_type="sheet" /note="strand 2"
SecStr 53..56 /sec_str_type="sheet" /note="strand 3"
SecStr 57..67 /sec_str_type="helix" /note="helix 2"
Bond bond(64,156) /bond_type="disulfide"
SecStr 77..90 /sec_str_type="helix" /note="helix 3"
SecStr 93..96 /sec_str_type="sheet" /note="strand 4"
SecStr 97..100 /sec_str_type="sheet" /note="strand 5"
Het join(bond(100),bond(100),bond(100),bond(104),bond(104),
bond(104),bond(127),bond(132),bond(133)) /heterogen="( CA, 2 )"
Het join(bond(104),bond(133),bond(133),bond(133)) /heterogen="( CA, 3 )"
SecStr 107..110 /sec_str_type="sheet" /note="strand 6"
Het join(bond(124),bond(126),bond(132),bond(144),bond(145),bond(145))
/heterogen="( CA, 1 )"
SecStr 133..139 /sec_str_type="sheet" /note="strand 7"
Bond bond(134,148) /bond_type="disulfide"
SecStr 141..147 /sec_str_type="sheet" /note="strand 8"
SecStr 150..153 /sec_str_type="sheet" /note="strand 9"
SecStr 154..158 /sec_str_type="sheet" /note="strand 10"
ORIGIN 1 eaeagsvasi rqqvealqgq vqhiqaafsq ykkvelfpng qsvgekifkt agfvkpftea
61 qllctqaggq lasprsaaen aalqqiwak neaaflsmtd sktegkftyp tgeslvysnw
121 apgepnddgg sedcveiftn gkwndracge kriwcef

SEQ ID NO: 88

NP_060049 deleted in malignant brain tumors 1 isoform c precursor [Homo sapiens] 
gi|8923740|ref|NP_060049.1|[8923740]
mat_peptide 26..2403 /product="deleted in malignant brain tumors 1 isoform c"
Region 102..202 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 105..202 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 234..334 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 237..334 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 363..463 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 366..463 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 484..584 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 487..584 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 594..692 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 595..692 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 723..823 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 726..823 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 852..952 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 855..952 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 983..1083 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 986..1083 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1112..1212 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1115..1212 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1241..1341 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1244..1341 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1370..1470 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1373..1470 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1499..1599 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1502..1599 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1630..1730 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1633..1730 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"

Region 1756..1867 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 1756..1864 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 1873..1976 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1885..1976 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1998..2106 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 1998..2104 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 2117..2371 /region_name="Zona pellucida-like domain"
/note="zona_pellucida" /db_xref="CDD:pfam00100"
Region 2117..2368 /region_name="Zona pellucida (ZP) domain" /note="ZP"
/db_xref="CDD:ZP"
CDS 1..2403 /gene="DMBT1" /coded_by="NM_017579.1:107..7318" /note="isoform
c is encoded by transcript variant 3" /db_xref="LocusID:1755"
/db_xref="MIM:601969"

ORIGIN 1 mgistvilem cllwgqvlst ggwiprttdy aslipsevpl dttvaegspf pseltlesta 



61 aegspisles tlestvaegs lipsestles tvaegsdsgl alrlvngdgr cqgrveilyr
121 gswgtvcdds wdtndanwc rqigcgwams apgnawfgqg sgpialddvr csghesylws 
181 cplingwishn cghgedagvi csaaqpqsti rpeswpvris ppvptegses slalrlvngg 
241 drcrgrvevl yrgswgtvcd dywdtndanv vcrqlgcgwa msapgnaqfg qgsgpividd 
301 vrcsghesyl wscphngwit lincghsedag vicsaplsrp tpspdtwpts hastagpess 
361 lalrlvnggd rcqgrvevly rgswgtvcdd swdtsdanw crqigcgwat sapgnarfgq 
421 gsgpivlddv rcsgyesylw scpiingwish ncqtisedagv icsdtlptit Ipastvgses 
481 slalrlvngg drcqgrvevl yrgswgtvcd dswdtndanv vcrqlgcgwa mlapgnarfg 
541 qgsgpividd vrcsgnesyl wscphngwis hncghsedag vicsgpessi alglvnggdr 
601 cqgrvevlyr gswgtvcdds wdtndanwc rqigcgwats apgnarfgqg sgpivlddvr 
661 csghesylws cpnngwishn cghhedagvi csaaqsrstp rpdtlstitl ppstvgsess 
721 Itlrlvngsd rcqgrvevly rgswgtvcdd swdtndanw crqigcgwat sapgnarfgq 
781 gsgpivlddv rcsghesylw scphngwish ncghhedagv icsvsqsrpt pspdtwptsh 
841 astagsessi alrlvnggdr cqgrvevlyr gswgtvcdds wdtsdanvvc rrlgcgwats 
901 apgnarfgqg sgpivlddvr csgyesylws cphngwishn cqhsedagvi csaahswstp 
961 spdtlptit! pastvgsess lalrlvnggd rcqgrvevly qgswgtvcdd swdtndanw 
1021 crqigcgwam sapgnarfgq gsgpivlddv rcsghesylw scphngwish ncghsedagv 
1081 icsasqsrpt pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr gswgtvcddy 
1141 wdtndanwc rqigcgwams apgnarfgqg sgpivlddvr csghesylws cphdgwishn 
1201 cghhedagvi csasqsqptp spdtwptsha stagsessia Irlvnggdrc qgrvevlyrg 
1261 pwgtvcddyw dtndanvvcr qigcgwatsa pgnarfgqgs gpivlddvrc sghesylwsc 
1321 phngwishnc ghhedagvic sasqsqptps pdtwptshas tagsesslal riynggdrcq 
1381 grvevlyrgs wgtvcddywd tndanvvcrq Igcgwatsap gsarfgqgsg pialddvrcs 
1441 ghesylwscp hngwishncg hhedagvics asqsqptpsp dtwptsrast agsestlair 
1501 Ivnggdrcrg rvevlyqgsw gtvcddywdt ndanvvcrql gcgwamsapg naqfgqgsgp 
1561 ividdvrcsg hesylwscph ngwishncgh hedagvicsa aqsqstprpd twittnlpal 
1621 tvgsesslal rivnggdrcr grvevlyrgs wgtvcddswd tndanvvcrq Igcgvvamsap 
1681 gnarfgqgsg pivlddvrcs gnesylwscp hkgwithncg hhedagvics atqinstttd
1741 wwhpttttta rpssncggfl fyasgtfssp sypayypnna kcvweievns gyrinlgfsn 
1801 Ikleahhncs fdyveifdgs Insslllgki cndtrqifts synrmtihfr sdisfqhtgf 
1861 lawynsfpsd atlrlvnins syglcagrve iyhggtwgtv cddswtiqea ewcrqigcg
1921 ravsalgnay fgsgsgpiti ddvecsgtes tlwqcrnrgw fshncnhred agvicsgnhl
1981 stpapflnit rpntdyscgg fisqpsgdfs spfypgnypn nakcvwdiev qnnyrvtvif 
2041 rdvqieggcn ydyievfdgp yrsspliarv cdgargsfts ssnfmsirfi sdhsitrrgf 
2101 raeyysspsn dstnllclpn hmqasvsrsy Iqslgfsasd Ivistwngyy ecrpqitpnl 
2161 viftipysgc gtfkqadndt idysnfltaa vsggiikrrt dlrihvscrm Iqntwvdtmy 
2221 iandtihvan ntiqveevqy gnfdvnisfy tsssflypvt srpyyvdinq dlyvqaeilh 
2281 sdavltlfvd tcvaspysnd ftsltydlir sgcvrddtyg pysspslria rfrfrafhfl
2341 nrfpsvylrc kmwcraydp ssrcyrgcvl rskrdvgsyq ekvdwigpi qlqtpprree 
2401 epr 






SEQ ID NO: 89 

NP_015568 deleted in malignant brain tumors 1 Isoform b precursor [Homo sapiens] 
gi|6633801|ref|NP_015568.1|[6633801]
sig_peptide 1..25
mat_peptide 26..2413 /product="deleted in malignant brain tumors 1 isoform b"
Region 102..202 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 105..202 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 234..334 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 237..334 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 363..463 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 366..463 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 494..594 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 497..594 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 602..702 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 605..702 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 733..833 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 736..833 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 862..962 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 865..962 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 993..1093 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 996..1093 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1122..1222 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1125..1222 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1251..1351 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1254..1351 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1380..1480 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1383..1480 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1509..1609 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1512..1609 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1640..1740 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1643..1740 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1766..1877 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 1766..1874 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 1883..1986 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1895..1986 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 2008..2116 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 2008..2114 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 2127..2381 /region_name="Zona pellucida-like domain"
/note="zona_pellucida" /db_xref="CDD:pfam00100"
Region 2127..2378 /region_name="Zona pellucida (ZP) domain" /note="ZP"
/db_xref="CDD:ZP"
CDS 1..2413 /gene="DMBT1" /coded_by="NM_007329.1:107..7348"
/db_xref="LocusID:1755" /db_xref="MIM:601969"
ORIGIN 1 mgistvilem cllwgqvlst ggwiprttdy aslipsevpl dqtvaegspf psestlesta
61 aegspisles tiestvaegs lipsestles tvaegsdsgl alrlvngdgr cqgrveilyr 
121 gswgtvcdds wdtndanwc rqigcgwams apgnawfgqg sgpialddvr csghesylws 
181 cphngwishn cghgedagvi csaaqpqstl rpeswpvris ppvptegses slalrlvngg 
241 drcrgrvevi yrgswgtvcd dywdtndanv vcrqigcgwa msapgnaqfg qgsgpividd 
301 vrcsghesyl wscphngwit hncghsedag vicsapqsrp tpspdtwpts hastagpess

361 lalrlvnggd rcqgrveviy rgswgtvcdd swdtsdanw crqigcgwat sapgnarfgq 
421 gsgpivlddv rcsgyesylw scphngwish ncqhsedagv icsaahswst pspdtlptit 
481 Ipastvgses slalrlvngg drcqgrvevi yrgswgtvcd dswdtndanv vcrqigcgwa 
541 mlapgnarfg qgsgpividd vrcsgnesyl wscphngvyls hncghsedag vicsgpessi 
601 alrlvnggdr cqgrvevlyr gswgtvcdds wdtndanwc rqigcgwams apgnarfgqg

661 sgpivlddvr csghesylws cpnngwlshn cghhedagvi csaaqsrstp rpdtlstiti 
721 ppstvgsess Itlrlvngsd rcqgrveviy rgswgtvcdd swdtndanw crqigcgwam 
781 sapgnarfgq gsgpivlddv rcsghesylw scphngwish ncghhedagv icsvsqsrpt
841 pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr gswgtvcdds wdtsdanwc
901 rqigcgwats apgnarfgqg sgpivlddvr csgyesylws cphngwishn cqhsedagvi

961 csaahswstp spdtlptiti pastvgsess lalrlvnggd rcqgrveviy qgswgtvcdd 
1021 swdtndanw crqpgcgwam sapgnarfgq gsgpivlddv rcsghesypw scphngwish
1081 ncghsedagv Icsasqsrpt pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr
1141 gswgtvcddy wdtndanwc rqigcgwams apgnarfgqg sgpivlddvr csghesylws
1201 cphngwishn cghhedagvi csasqsqptp spdtwptsha stagsessia Irlvnggdrc
1261 qgrvevlyrg swgtvcddyw dtndanwcr qigcgwatsa pgnarfgqgs gpivlddvrc
1321 sghesylwsc phngwishnc ghhedagvic sasqsqptps pdtwptshas tagsesslal
1381 rivnggdrcq grvevlyrgs wgtvcddywd tndanvvcrq Igcgwatsap gnarfgqgsg
1441 pivlddvrcs ghesylwscp hngwishncg hhedagvics asqsqptpsp dtwptsrast
1501 agsestlair Ivnggdrcrg rvevlyqgsw gtvcddywdt ndanwcrql gcgwamsapg
1561 naqfgqgsgp ividdvrcsg hesylwscph ngwishncgh hedagvicsa aqsqstprpd
1621 twittnlpal tvgsesslal rivnggdrcr grvevlyrgs wgtvcddswd tndanvvcrq
1681 Igcgwamsap gnarfgqgsg pivlddvrcs gnesylwscp hkgwithncg hhedagvics
1741 atqinstttd wwhpttttta rpssncggfl fyasgtfssp sypayypnna kcvweievns
1801 gyrinlgfsn Ikleahhncs fdyveifdgs Insslllgki cndtrqifts synrmtihfr
1861 sdisfqntgf lawynsfpsd atlrlvnlns syglcagrve iyhggtwgtv cddswtiqea
1921 evvcrqlgcg ravsalgnay fgsgsgpiti ddvecsgtes tlwqcrnrgw fshncnhred
1981 agvicsgnhl stpapflnit rpntdyscgg fisqpsgdfs spfypgnypn nakcvwdiev
2041 qnnyrvtvif rdvqieggcn ydyievfdgp yrsspliarv cdgargsfts ssnfmsirfi
2101 sdhsitrrgf raeyysspsn dstnllclpn hmqasvsrsy Iqslgfsasd Ivistwngyy
2161 ecrpqitpnl viftipysgc gtfkqadndt idysnfltaa vsggiikrrt dirihvscrm
2221 Iqntwvdtmy iandtihvan ntiqveevqy gnfdvnisfy tsssflypvt srpyyvdinq
2281 dlyvqaeilh sdavltlfvd tcvaspysnd ftsltydlir sgcvrddtyg pysspslria
2341 rfrfrafhfl nrfpsvylrc kmvvcraydp ssrcyrgcvl rskrdvgsyq ekvdwigpi
2401 qlqtpprree epr

SEQ ID NO: 90 

NP_004397 deleted in malignant brain tumors 1 isoform a precursor [Homo sapiens] 
gi|4758170|ref|NP_004397.1|[4758170]

sig_peptide 1..25
mat_peptide 26..1785 /product="deleted in malignant brain tumors 1 isoform a"
Region 102..202 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 105..202 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 234..334 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 237..334 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 363..463 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 366..463 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 494..594 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 497..594 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 623..723 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 626..723 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 752..852 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 755..852 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 881..981 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 884..981 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1012..1112 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1015..1112 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1138..1249 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 1138..1246 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 1255..1358 /region_name="Scavenger receptor Cys-rich" /note="SR"
/db_xref="CDD:SR"
Region 1267..1358 /region_name="Scavenger receptor cysteine-rich domain"
/note="SRCR" /db_xref="CDD:pfam00530"
Region 1380..1488 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:CUB"
Region 1380..1486 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 1499..1751 /region_name="Zona pellucida-like domain"
/note="zona_pellucida" /db_xref="CDD:pfam00100"
Region 1499..1750 /region_name="Zona pellucida (ZP) domain" /note="ZP"
/db_xref="CDD:ZP"
CDS 1..1785 /gene="DMBT1" /coded_by="NM_004406.1:107..5464"
/db_xref="LocusID:1755" /db_xref="MIM:601969"

ORIGIN 1 mgistvilem cllwgqvlst ggwiprttdy aslipsevpl dqtvaegspf psestlesta 
61 aegspisles tiestvaegs lipsestles tvaegsdsgl alrlvngdgr cqgrveiiyr

121 gswgtvcdds wdtndanwc rqigcgwams apgnawfgqg sgpialddvr csghesylws 
181 cphngwlshn cghgedagvi csaaqpqsti rpeswpvris ppvptegses sialrlvngg 
241 drcrgrvevi yrgswgtvcd dywdtndanv vcrqlgcgwa msapgnaqfg qgsgpividd 
301 vrcsghesyl wscphngwit hncghsedag vicsapqsrp tpspdtwpts hastagpess 
361 lalrlvnggd rcqgrvevly rgswgtvcdd swdtsdanw crqigcgwat sapgnarfgq

421 gsgpivlddv rcsgyesylw scphngwlsh ncqhsedagv icsaahswst pspdtlptit 
481 Ipastvgses slalrlvngg drcqgrvevl yqgswgtvcd dswdtndanv vcrqpgcgwa 
541 msapgnarfg qgsgpividd vrcsghesyp wscphngwls hncghsedag vicsasqsrp 
601 tpspdtwpts hastagsess lalrlvnggd rcqgrvevly rgswgtvcdd ywdtndanvv 
661 crqlgcgwam sapgnarfgq gsgpivlddv rcsghesylw scphngwlsh ncghhedagv

721 icsasqsqpt pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr gswgtvcddy 
781 wdtndanwc rqigcgwats apgnarfgqg sgpiylddvr csghesylws cphngwlshn
841 cghhedagvi csasqsqptp spdtwptsra stagsestia Irlvnggdrc rgrvevlyqg 
901 swgtvcddyw dtndanwcr qigcgwamsa pgnaqfgqgs gpivlddvrc sghesylwsc 
961 phngwishnc ghhedagvic saaqsqstpr pdtwittnip altvgsessi alrlvnggdr

1021 crgrvevlyr gswgtvcdds wdtndanwc rqigcgwams apgnarfgqg sgpivlddvr 
1081 csgnesylws cphkgwithn cghhedagvi csatqinstt tdwwhptttt tarpssncgg 
1141 flfyasgtfs spsypayypn nakcvweiev nsgyrinlgf snikleahhn csfdyyeifd
1201 gslnsslllg kicndtrqif tssynrmtih frsdisfqnt gflawynsfp sdatlrlvnl 
1261 nssyglcagr veiyhggtwg tvcddswtiq eaewcrqig cgravsaign ayfgsgsgpi

1321 tiddvecsgt estlwqcrnr gwfshncnhr edagvicsgn histpapfin itrpntdysc 
1381 ggflsqpsgd fsspfypgny pnnakcvwdl evqnnyrvtv ifrdvqiegg cnydyievfd 
1441 gpyrssplia rvcdgargsf tsssnfmsir fisdhsitrr gfraeyyssp sndstnllcl
1501 pnhmqasvsr sylqslgfsa sdlvistwng yyecrpqitp niviftipys gcgtfkqadn
1561 dtidysnfit aavsggiikr rtdlrihvsc rmlqntwvdt myiandtihv anntiqveev
1621 qygnfdvnis fytsssflyp vlsrpyyvdl nqdiyvqaei Ihsdavltif vdtcvaspys
1681 ndftsltydl irsgcvrddt ygpysspsir iarfrfrafh finrfpsvyl rckmwcray
1741 dpssrcyrgc virskrdvgs yqekvdvvlg piqiqtpprr eeepr

SEQ ID NO: 91 

LNBOC1 pulmonary surfactant protein C - bovine 
gi|7428752|pir||LNBOC1[7428752]
FEATURES Location/Qualifiers source 1..34 /organism="Bos taurus"
/db_xref="taxon:9913" 

Protein 1..34 /product="pulmonary surfactant protein C" /note="pulmonary surfactant 
protein PSP-6" 

Site 4 /site_type="binding" /note="palmitate (Cys) (covalent)" 
Site 5 /site_type="binding" /note="palmitate (Cys) (covalent)"
ORIGIN 1 iipccpvnik rlliwww llvwivgal Imgl 

SEQ ID NO: 92 

LNDGC1 pulmonary surfactant protein C - dog gi|7428750|pir||LNDGC1[7428750]
FEATURES Location/Qualifiers source 1..35 /organism="Canis familiaris"
/db_xref="taxon:9615"

Protein 1..35 /product="pulmonary surfactant protein C"
Site 5 /site_type="binding" /note="palmitate (Cys) (covalent)" 
ORIGIN 1 IgipcfpssI kriliivwi vlv>An/ivga llmgl // 






SEQ ID NO: 93 

JN0450 conglutinin precursor - bovine gi|346501|pir||JN0450[346501]

FEATURES Location/Qualifiers source 1..371 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..371 /product="conglutinin precursor" /note="C3b-binding protein"
Region 1..20 /region_name="domain" /note="signal sequence"
Region 21..371 /region_name="product" /note="conglutinin"
Region 46..214 /region_name="region" /note="collagen-like"
Site 63 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 63 /site_type="modified" /note="5-hydroxylysine (Lys)"
Region 75..371 /region_name="product" /note="conglutinin-N"
Site 78 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 87 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 87 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 96 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 99 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 99 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 108 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 111 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 129 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 132 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 135 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 135 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 141 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 141 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 147 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 153 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 159 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 159 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 162 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 162 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 171 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 195 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 198 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 198 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 210 /site_type="binding" /note="carbohydrate (Lys) (covalent)"
Site 210 /site_type="modified" /note="5-hydroxylysine (Lys)"
Region 248..369 /region_name="domain" /note="C-type lectin homology #label LCH"
Site 337 /site_type="binding" /note="carbohydrate (Asn) (covalent)"
ORIGIN 1 mlllplsvll lltqpwrslg aemttfsqki lanactlvmc splesglpgh dgqdgrecph 
61 gekgdpgspg pagragrpgw vgpigpkgdn gfvgepgpkg dtgprgppgm pgpagregps
121 gkqgsmgppg tpgpkgetgp kggvgapgiq gfpgpsglkg ekgapgetga pgragvtgps
181 gaigpqgpsg argppglkgd rgdpgetgak gesglaevna Ikqrvtildg hlrrfqnafs
241 qykkavlfpd gqavgekifk tagavksysd aeqicreakg qiasprssae neavtqmvra
301 qeknaylsmn distegrfty ptgeilvysn wadgepnnsd egqpencvel fpdgkwndvp
361 cskqilvicef

SEQ ID NO: 94 

A45225 pulmonary surfactant protein D precursor - human
gi|346375|pir||A45225[346375]

FEATURES Location/Qualifiers source 1..375 /organism="Homo sapiens"
/db_xref="taxon:9606"

Protein 1..375 /product="pulmonary surfactant protein D precursor" /note="SP-D"
Region 1..20 /region_name="domain" /note="signal sequence"
Region 21..375 /region_name="product" /note="pulmonary surfactant protein D"
Region 21..45 /region_name="domain" /note="non-collagenous"
Region 46..222 /region_name="domain" /note="collagenous"
Site 90 /site_type="binding" /note="carbohydrate (Asn) (covalent)"
Region 223..375 /region_name="domain" /note="non-collagenous"
Region 254..373 /region_name="domain" /note="C-type lectin homology #label LCH"
Bond bond(281,373) /bond_type="disulfide"
Bond bond(351,365) /bond_type="disulfide"

ORIGIN 1 mllfllsalv lltqplgyle aemktyshrt mpsactlvmc ssvesglpgr dgrdgregpr 
61 gekgdpglpg aagqagmpgq agpvgpkgdn gsvgepgpkg dtgpsgppgp pgvpgpagre
121 galgkqgnig pqgkpgpkge agpkgevgap gmqgsagarg lagpkgergv pgergvpgnt
181 gaagsagamg pqgspgargp pglkgdkgip gdkgakgesg Ipdvaslrqq vealqgqvqh
241 Iqaafsqykk velfpngqsv gekifktagf vkpfteaqll ctqaggqias prsaaenaal
301 qqiwaknea aflsmtdskt egkftyptge sivysnwapg epnddggsed cveiftngkw
361 ndracgekri wcef

SEQ ID NO: 95 

LNHUC pulmonary surfactant protein C precursor, long splice form - human
gi|71983|pir||LNHUC[71983]
FEATURES Location/Qualifiers source 1..197 /organism="Homo sapiens"
/db_xref="taxon:9606"

Protein 1..197 /product="pulmonary surfactant protein C precursor, long splice form"
/note="3.7 kDa surfactant polypeptide; pulmonary surfactant protein SP5; pulmonary
surfactant proteolipid SP-C; pulmonary surfactant proteolipid SPL(pVal)"
Region 1..197 /region_name="product" /note="pulmonary surfactant protein C
precursor, short splice form"
Region 1..145 /region_name="product" /note="pulmonary surfactant protein C
precursor, short splice form"
Region 1..23 /region_name="domain" /note="propeptide"
Region 24..58 /region_name="product" /note="pulmonary surfactant protein C"
Site 28 /site_type="binding" /note="palmitate (Cys) (covalent)"
Site 29 /site_type="binding" /note="palmitate (Cys) (covalent)"
Region 152..197 /region_name="product" /note="pulmonary surfactant protein C
precursor, short splice form"
ORIGIN 1 mdvgskevim esppdysaap rgrfgipccp vhlkrllivv wwlivvvi vgallmglhm

61 sqkhtemvie msigapeaqq rialsehlvt tatfsigstg IwydyqqII iaykpapgtc 
121 cyimkiapes ipslealnrk vhnfqmecsl qakpavptsk Igqaegrdag sapsggdpaf 
181 Igrhavntlcgevplyyi 


SEQ ID NO: 96 

LNDGPS pulmonary surfactant protein A precursor - dog
gi|71970|pir||LNDGPS[71970]

FEATURES Location/Qualifiers source 1..248 /organism="Canis familiaris"
/db_xref="taxon:9615"

Protein 1..248 /product="pulmonary surfactant protein A precursor"
/note="pulmonary surfactant 32K apoprotein; pulmonary surfactant-associated
protein PSP-A"
Region 1..17 /region_name="domain" /note="signal sequence"
Region 18..248 /region_name="product" /note="pulmonary surfactant protein A"
Site 20 /site_type="binding" /note="carbohydrate (Asn) (covalent)"
Region 28..102 /region_name="region" /note="collagen-like"
Site 30 /site_type="modified" /note="4-hydroxyproline (Pro)"
Region 127..246 /region_name="domain" /note="C-type lectin homology #label LCH"
Site 207 /site_type="binding" /note="carbohydrate (Asn) (covalent)"

ORIGIN 1 mwlrclalal tilmvsgien ntkdvcvgnp gipgtpgshg Ipgrdgrdgv kgdpgppgpl 

61 gppggmpghp gpngmtgapg vagergekge pgergppglp asldeelqtt Ihdirhqilq 
121 tmgvlslhes llwgrkvfs sgaqsinfnd iqelcagagg qiaapmspee neavasivkk 
181 yntyaylglv espdsgdfqy mdgapvnytn wypgeprgrg keqcvemytd gqwnnknclq

241 yrlaicef 

SEQ ID NO: 97 

LNHUPS pulmonary surfactant protein A precursor (genomic clone) - human
gi|71967|pir||LNHUPS[71967]

FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606"

Protein 1..248 /product="pulmonary surfactant protein A precursor (genomic clone)"
/note="alveolar proteinosis protein; pulmonary surfactant 32K apoprotein; pulmonary
surfactant-associated protein (PSP-A)"
Region 1..20 /region_name="domain" /note="signal sequence"
Region 21..248 /region_name="product" /note="pulmonary surfactant protein A"
Bond bond(26) /bond_type="disulfide" /note="interchain"
Region 28..100 /region_name="domain" /note="collagenous"
Site 30 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 33 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 36 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 42 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 51 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 57 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 63 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 76 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 79 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 82 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 88 /site_type="modified" /note="5-hydroxylysine (Lys)"
Site 91 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 97 /site_type="modified" /note="4-hydroxyproline (Pro)"
Region 127..246 /region_name="domain" /note="C-type lectin homology #label LCH"
Bond bond(155,246) /bond_type="disulfide"
Site 207 /site_type="binding" /note="carbohydrate (Asn) (covalent)"
Bond bond(224,238) /bond_type="disulfide"
ORIGIN 1 mwlcplalnl ilmaasgavc evkdvcvgsp gipgtpgsfig Ipgrhgrdgl kgdigppgpm
61 gppgempcpp gndglpgapg ipgecgekge pgergppglp ahldeelqat Ihdfrhqilq 
121 trgalslqgs inntvgekvfs sngqsitfda Iqeacaragg riavprnpee neaiasfvkk 
181 yntyayvgit egpspgdfry sdgtpvnytn wyrgepagrg keqcvemytd gqwndmcly 
241 sritlcef 






SEQ ID NO: 98 

A53570 collectin-43 - bovine gi|1083017|pir||A53570[1083017]

FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..301 /product="collectin-43" /note="lectin CL-43"
Region 177..299 /region_name="domain" /note="C-type lectin homology #label LCH"

ORIGIN 1 eemdvysekt ltdpctlwc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp
61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapgpgg aigpqgpsga
121 mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege vqriqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqicreakgq lasprssaen eavtqivrak nkhaylsmnd
241 iskegkftyp tggsidysnw apgepnnrak degpenclei ysdgnwndie creerlvice
301 f



SEQ ID NO: 99 

S33603 surfactant protein D - bovine gi|423283|pir||S33603[423283]

FEATURES Location/Qualifiers source 1..369 /organism="Bos taurus"
/db_xref="taxon:9913"

Protein 1..369 /product="surfactant protein D"
Region 248..367 /region_name="domain" /note="C-type lectin homology #label LCH"

ORIGIN 1 mlllplsvil lltqpwrslg aemkiysqkt manactlvmc sppedglpgr dgrdgregpr 

61 gekgdpgspg pagragmpgp agpiglkgdn gsagepgpkg dtgppgppgm pgpagregps
121 gkqgsmgppg tpgpkgdtgp kggvgapgiq gspgpaglkg ergapgdpga pgragapgpr
181 gaigpqgpsg argppglkgd rgtpgergak gesglaevna Irqrvgileg qlqriqnafs
241 qykkamlfpn grsvgekifk tvgsektfqd aqqictqagg qlpsprsgae nealtqiata
301 qnkaaflsms dtrkegtfiy ptgeplvysn wapqepnndg gsencveifp ngkwndkvcg
361 eqrivicef

SEQ ID NO: 100

AAF28384 lung surfactant protein A [Sus scrofa]
gi|6782434|gb|AAF28384.1|AF133668_1[6782434]

FEATURES Location/Qualifiers source 1..116 /organism="Sus scrofa"
/db_xref="taxon:9823"

Protein <1..116 /product="lung surfactant protein A" /function="involved in the innate
immune system and lipid homeostasis within the lung" /name="collectin; SPA; SP-A"
CDS 1..116 /gene="SFTPA" /coded_by="AF133668.1:<1..353"

ORIGIN 1 avgekvfstn gqsvafdvir elcaraggri aaprspeene aiasivkkhn tyaylglveg 
61 ptagdffyld gtpvnytnwy pgeprgrgke kcvemytdgq wndrncqqyr laicef 

SEQ ID NO: 101

AAF22145 lung surfactant protein D precursor; SPD; SP-D; CP4 [Sus scrofa]
gi|6760482|gb|AAF22145.2|AF132496_1[6760482]
sig_peptide 1..20
mat_peptide 21..378 /product="lung surfactant protein D"
CDS 1..378 /gene="SFTPD" /coded_by="AF132496.2:44..1180"
ORIGIN 1 mlllplsvli lltqpprslg aemktysqra vanacalvmc spmenglpgr dgrdgregpr
61 gekgdpglpg avgragmpgl agpvgpkgdn gstgepgakg digpcgppgp pgipgpagke
121 gpsgqqgnig ppgtpgpkge tgpkgevgal gmqgstgarg paglkgerga pgergapgsa
181 gaagpagatg pqgpsgargp pglkgdrgpp gergakgesg Ipgitalrqq vetlqgqvqr
241 Iqkafsqykk velfppgrgv gekifktggf ektfqdaqqv ctqaggqmas prseteneal
301 sqlvtaqnka aflsmtdikt egnftyptge pivyanwapg epnnnggssg aencveifpn
361 gkwndkacge Irlvicef
SEQ ID NO: 102

P15783 PULMONARY SURFACTANT-ASSOCIATED PROTEIN C (SP-C)
(PULMONARY SURFACTANT-ASSOCIATED PROTEOLIPID SPL(VAL))
gi|131422|sp|P15783|PSPC_BOVIN[131422]

FEATURES Location/Qualifiers source 1..34 /organism="Bos taurus"
/db_xref="taxon:9913"
gene 1..34 /gene="SFTPC" /note="SFTP2"
Protein 1..34 /gene="SFTPC" /product="PULMONARY SURFACTANT-ASSOCIATED PROTEIN C"
Site 4 /gene="SFTPC" /site_type="lipid-binding" /note="PALMITATE (BY SIMILARITY)."
Site 5 /gene="SFTPC" /site_type="lipid-binding" /note="PALMITATE (BY SIMILARITY)."
Region 21 /gene="SFTPC" /region_name="Conflict" /note="L -> V (IN REF. 2)."
Region 26 /gene="SFTPC" /region_name="Conflict" /note="I -> V (IN REF. 2)."
Region 28..34 /gene="SFTPC" /region_name="Conflict" /note="GALLMGL ->
IGAMLAM (IN REF. 2)."
ORIGIN 1 lipccpvnik rlliwvvw llwvivgal Imgl 


SEQ ID NO: 103 

P35246 PULMONARY SURFACTANT-ASSOCIATED PROTEIN D PRECURSOR
(SP-D) (PSP-D)
gi|464485|sp|P35246|PSPD_BOVIN[464485]

FEATURES Location/Qualifiers source 1..369 /organism="Bos taurus"
/db_xref="taxon:9913"
gene 1..369 /gene="SFTPD" /note="SFTP4"
Protein 1..369 /gene="SFTPD" /product="PULMONARY SURFACTANT-ASSOCIATED PROTEIN D PRECURSOR"
Region 1..20 /gene="SFTPD" /region_name="Signal" /note="BY SIMILARITY."
Region 21..369 /gene="SFTPD" /region_name="Mature chain"
/note="PULMONARY SURFACTANT-ASSOCIATED PROTEIN D."
Region 46..216 /gene="SFTPD" /region_name="Domain" /note="COLLAGEN-LIKE."
Site 78 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 87 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 90 /gene="SFTPD" /site_type="glycosylation" /note="POTENTIAL."
Site 96 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 99 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 165 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Site 171 /gene="SFTPD" /site_type="hydroxylation" /note="(BY SIMILARITY)."
Region 217..248 /gene="SFTPD" /region_name="Domain" /note="COILED COIL
(POTENTIAL)."
Region 273..369 /gene="SFTPD" /region_name="Domain" /note="C-TYPE LECTIN
(SHORT FORM)."
Bond bond(275,367) /gene="SFTPD" /bond_type="disulfide" /note="BY
SIMILARITY."
Bond bond(345,359) /gene="SFTPD" /bond_type="disulfide" /note="BY
SIMILARITY."

ORIGIN 1 mlllplsvll lltqpwrslg aemklysqkt manactlvmc sppedglpgr dgrdgregpr
61 gekgdpgspg pagragmpgp agpiglkgdn gsagepgpkg dtgppgppgm pgpagregps
121 gkqgsmgppg tpgpkgdtgp kggvgapgiq gspgpaglkg ergapgepga pgragapgpa
181 gaigpqgpsg argppglkgd rgtpgergak gesglaevna Irqrvgileg qlqriqnafs
241 qykkamlfpn grsvgekifk tvgsektfqd aqqictqagg qlpsprsgae nealtqiata
301 qnkaaflsms dtrkegtfiy ptgeplvysn wapqepnndg gsencveifp ngkwndkvcg
361 eqrivicef

SEQ ID NO: 104

P42916 COLLECTIN-43 (CL-43) gi|1168967|sp|P42916|CL43_BOVIN[1168967]
FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913"
Protein 1..301 /product="COLLECTIN-43"
Region 29..142 /region_name="Domain" /note="COLLAGEN-LIKE (G-X-Y)."
Region 202..301 /region_name="Domain" /note="C-TYPE LECTIN (SHORT FORM)."
Bond bond(264,299) /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(277,291) /bond_type="disulfide" /note="BY SIMILARITY."
ORIGIN 1 eemdvysekt ltdpctlwc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp
61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapgpgg aigpqgpsga
121 mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege vqriqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqicreakgq lasprssaen eavtqivrak nkhaylsmnd
241 iskegkftyp tggsldysnw apgepgnrak degpenclei ysdgnwndie creerlvice
301 f

SEQ ID NO: 105

CAB56155 DMBT1/8kb.2 protein [Homo sapiens]
gi|5912464|emb|CAB56155.1|[5912464]
sig_peptide 1..26
mat_peptide 26..2412 /product="DMBT1/8kb.2 protein"
CDS 1..2412 /gene="DMBT1" /coded_by="AJ243212.1:107..7345"
/note="Sequence is an alternative splice form of the DMBT1 gene that is expressed
in human adult trachea. Isoforms of DMBT1 are identical to the collectin binding
protein gp-340. Full-length cDNA clone contains 1 bp deletions in codons 100 and
1751, that were corrected by comparison with the genomic exons"

ORIGIN 1 mgistvilem cllwgqvlst ggwiprttdy aslipsevpl dttvaegspf pseltlestv 
61 aegspisles tiettvaegs lipsestles tvaegsdsgl alrlvngdgr cqgrveilyr 
121 gswgavcdds wdtndanvvc rqlgcgwams apgnawfgqg sgpialddvr csghesylws
181 cphngwishn cghgedagvi csaaqpqsti rpeswpvris ppvptegses slalrlvngg 

241 drcrgrvevi yrgswgtvcd dywdtndanv vcrqigcgwa msapgnaqfg qgsgpividd
301 vrcsghesyl wscphngwit hncghsedag vicsapqsrp tpspdtwpts hastagpess 
361 lalrlvnggd rcqgrvevly rgswgtvcdd swdtsdanvv crqigcgwat sapgnarfgq 
421 gsgpivlddv rcsgyesylw scphngwish ncqhsedagv icsaahswst pspdtlptit 
481 Ipastvgses slalrlvngg drcqgrvevi yrgswgtvcd dswdtndanv vcrqigcgwa 

541 mlapgnarfg qgsgpividd vrcsgnesyl wscphngwis hncghsedag vicsgpessi
601 alrlvnggdr cqgrvevlyr gswgtvcdds wdtndanvvc rqlgcgwams apgnarfgqg 
661 sgpivlddvr csghesylws cpnngwishn cghhedagvi csaaqsrstp rpdtlstiti 
721 ppstvgsess Itlrlvngsd rcqgrvevly rgswgtvcdd swdtndanw crqigcgwat 
781 sapgnarfgq gsgpivlddv rcsghesylw scphngwish ncghhedagv icsvsqsrpt 

841 pspdtwptsh astagpessi alrlvnggdr cqgrvevlyr gswgtvcdds wdtsdanvvc
901 rqigcgwats apgnarfgqg sgpivlddvr csgyesylws cphngwishn cqhsedagvi 
961 csaahswstp spdtlptiti pastvgsess lalrlvnggd rcqgrvevly qgswgtvcdd 
1021 swdtndanvy crqigcgwam sapgnarfgq gsgpivldda rcsghesylw scphngwish 
1081 ncahsedagv icsasqsrpt pspdtwptsh astagsessi alrlvnggdr cqgrvevlyr
?4 gSvcddy wdtndanvac rqlgcgwams apgnarfgqg gP^^^?,!,'-,^^^^^^^^^^^ 
20 Ephngwishn cghhedagvi csasqsqptp spdtwptsha ^tagsessla Irlvnggdrc 
1261 aSrvevlyrg swgtvcddyw dtndanvvcr qigcgwatsa pgnarfgqgs gP'v'ddvrc 
32 STyKhngwIsJJnic ghhedagvic sasqsqptps Pdtwptshas tagsesslal 
1^81 rivnaadrcq qrve^^yrgs wgtvcddywd tndanwcrq Igcgwatsap gnarfgqgsg 
S . pSvrSgrel^^^^^^^ hngwlshncg hhedagvlcs afqsqptpsp d^^^^^^^ 
1 501 aaSstlaIr Ivnggdrcrg rvevlyqgsw gtvcddywdt ndanvycrq gcQwamsapg 
^61 SSaaSp ivldd^^^^^^ hepylwscph ngwishncgh hedagvicsa aqsqstprpd 
62 SSlKsTsslal rlvnggdrcr g^/evlyrgs wgtvcddswd t^^^^^^ 
iR8l tacowamsap gnarfgqgsg pivlgdvrcs gnesylwscp hkgwithncg hhedagvics 
?4 Smd ^&ta r^^^^^^ sypayypnna kcywe.evns 

80 gyr nfgfsnrea^h Insslllgki cndtrqitts synrrnbhfr 

1861 sd sfantqf lawynsfpsd atlrlvnins syglcagrve iyhggtwgav cddswtiqea 
921 |wc?qTqcg S ddvecsgtes tlwqcrnrgw fshncnhred 

g agvSsgrS Spapflnit rVnysc^gf ^-^^-?<^^-- S'^'HTJ^^^^^^ 
2041 nnvrvtvifr dvqieggcny dyievfdgpy rsspiiarvc dgargsftss snfmsirfis 
2?01 ShSr ae?ysspsnd stnllclpnh mqasvsrsyl qslgfsasdl vstwngyye 
2161 Spn V iftSysgcg tfkqadndti dysnlltaav sggiikrrtd Inhvscrm 
12^1 qnSmyi andt»ivann tiqveevqyg nfdvnisfyt sssflypvts rpvy^dlnqd 
2281 iwSihs davltlfvdt cvaspysndf tsltydlirs gcvrddtygp ysspslnar 
2S1 iSin rf^^^^^^^ mwcraydps srcyrgcvir skrdvgsyqe kvdwlgp.q 
2401 Iqtpprreee pr 

SEQ ID NO: 106

AAD49696 gp-340 variant protein [Homo sapiens]

cysteine-rich protein SRCR" /note="putative receptor for
CDS 1..2413 /gene="DMBT1" /coded_by="AF159456.1:107..7348"

gwo N ^rngSS^^Im clTwgqvlst ggwiplttdy aslipsevpl dqtvaegsp psestlesta 
R1 aeosD sies tlestvaegs lipsestles tvaegsdsgl alrlvngdgr cqgryeilyr 
121 glwgfvSd^^^^ rqlgcgwams apgnawfgqg -9P'^'f^^I,^fg^;^^^^^^ 

181 cDhnqwIshn cghgedagvi csaaqpqsti rpeswpvns ppvptegses sialrlvngg 
24 ScrXvWrSwgtvcd dyw^^^ msapgnaqfg qgsgpivWd 

lo vrSK wscph^ hncghsedag vicsapqsrp tpspdtwpts h^st^^^^^^ 
361 lalrlvnqgd rcqgrvevly rgswgtvcdd swdtsdanvv crqlgcgwat sapgnarfgq 
42 SSdv rSgyesylw scphngwish ncqhsedagv icsaahswst pspctt pW 
Is IpasCses slalrilngg drcqgrvevi yrgswgtvcd f ^^tndanv vc qj^^^^^^^ 
541 mlapgnarfg qgsgpividd vrcsgnesyl wscphngwls hncghsedag vicsgpessl 
60 S tnggdr cqgrvevlyr gswgtvcdds wdtndanwc rqlgcgwams apgnarfgqg 
RR1 saDivSdvr csqhesylws cpnngwlshn cghhedagvi csaaqsrstp rpdtlstitl 
?21 Sqs^ s r^^^^^^^^ rgswgtvcdd swdtndanvv crqigcgwam 

78 sapqnarfgq gsgS rcsghesylw scphngwish ncghhedagv .csvsqsrpt 
S SpStiraste^esslalrlWdrcqgrvevlyrgswg^^^^^^^ 
901 rqigcgwats apgnarfgqg sgpivlddvr csgyesylws ^Phngwlshn c^ql^sedagvi 
QR1 SaahswstD spdtlptitl pastvgsess lalrlvnggd rcqgrvevly qgswgtvcdd _ 
iSlsw^tSv c^^^^^^^^ Ssgpivlddv rcsghesypw scphngwish 

Z\ ncgS^edagvJcsasqsrpt pspdtwptsh astagsessi ^Irlvnggd^^^^^^^^^ 
1141 qswgtvcddy wdtndanwc rqlgcgwams apgnarfgqg sgpivlddvr csghesyiws 
5201 Shngwishn cghhedagvi ciasqsqptp spdtwptsha stagsessla Irlvnggdrc 
1261 qgrvevlyrg swgtvcddyw dtndanvvcr qigcgwatsa pgnarfgqgs gpivlddvrc
1321 sghesylwsc phngwishnc ghhedagvic sasqsqptps pdtwptshas tagsesslal 
1381 rivnggdrcq grvevlyrgs wgtvcddywd tndarivvcrq Igcgwatsap gnarfgqgsg 
1441 pivlddvrcs ghesylwscp hngwishncg hhedagvics asqsqptpsp dtwptsrast 
1501 agsestlalr Ivnggdrcrg rvevlyqgsw gtvcddywdt ndanvvcrql gcgwamsapg
1561 naqfgqgsgp ividdvrcsg hesytwscph ngwishncgh hedagvicsa aqsqstprpd
1621 twittnlpal tvgsesslal rlvnggdrcr grvevlyrgs wgtvcddswd tndanwcrq 
1681 Igcgwamsap gnarfgqgsg pivlddvrcs gnesylwscp hkgwithhcg hhedagvics
1741 atqinstttd wwhpttttta rpssncggfl fyasgtfssp sypayypnna kcvweievns 

1801 gyrinlgfsn Ikleahhncs fdyveifdgs Insslllgki cndtrqifts synrmtihfr

1861 sdisfqntgf lawynsfpsd atlrlvnins syglcagrve iyhggtwgtv cddswtiqea 
1921 evvcrqigcg ravsalgnay fgsgsgpiti ddvecsgtes tlwqcmrgw fshncnhred 
1981 agvicsgnhl stpapflnit rpntdyscgg flsqpsgdfs spfypgnypn nakcvwdiev 
2041 qnnyrvtvif rdvqieggcn ydyjevfdgp yrsspliarv cdgargsfts ssnfmsirfi 

2101 sdhsitrrgf raeyysspsn dstnllclpn hmqasvsrsy Iqslgfsasd Ivistwngyy
2161 ecrpqitpnl viftipysgc gtfkqadndt idysnfltaa vsggiikrrt dlrihvscrm 
2221 Iqntwvdtmy landtihvan ntiqveevqy gnfdvnisfy tsssflypvt srpyyvdlnq 
2281 dlyvqaeilh sdavltlfvd tcvaspysnd ftsltydlir sgcvrddtyg pysspslria
2341 rfrfrafhfl nrfpsvylrc kmwcraydp ssrcyrgcvl rskrdvgsyq ekvdwigpi

2401 qlqtpprree epr

SEQ ID NO: 107

AAD31380 surfactant protein D precursor [Mus musculus]
gi|4877556|gb|AAD31380.1|AF047742_1[4877556]
sig_peptide 1..19
mat_peptide 20..374 /product="surfactant protein D"
CDS 1..374 /gene="Sftp4"
/coded_by="join(AF047741.1:5705..5900,AF047742.1:312..428,
AF047742.1:669..785,AF047742.1:1112..1228,
AF047742.1:1977..2093,AF047742.1:3162..3245,AF047742.1:5010..5386)"
ORIGIN 1 mlpflsmlvl Ivqplgnlga emkslsqrsv pntctlvmcs ptenglpgrd grdgregprg
61 ekgdpglpgp mglsglqgpt gpvgpkgeng sagepgpkge rglsgppglp gipgpagkeg
121 psgkqgnigp qgkpgpkgea gpkgevgapg mqgstgakgs tgpkgergap gvqgapgnag
181 aagpagpagp qgapgsrgpp glkgdrgvpg drgikgesgl pdsaalrqqm ealkgklqrl
241 evafshyqka alfpdgrsvg dkifrtadse kpfedaqemc kqaggqiasp rsatenaaiq
301 qlitahnkaa fismtdvgte gkftyptgep Ivysnwapge pnnnggaenc veiftngqwn
361 dkacgeqriv icef

SEQ ID NO: 108

B61249 pulmonary surfactant protein C - dog gi|539712|pir||B61249[539712]
FEATURES Location/Qualifiers source 1..35 /organism="Canis familiaris"
/db_xref="taxon:9615"

Protein 1..35 /product="pulmonary surfactant protein C"
ORIGIN 1 IgipcfpssI kriliivvvi vlwvvivga llmgl

SEQ ID NO: 109

S00609 pulmonary surfactant protein C - bovine gi|89749|pir||S00609[89749]

FEATURES Location/Qualifiers source 1..34 /organism="Bos taurus"
/db_xref="taxon:9913"
Protein 1..34 /product="pulmonary surfactant protein C" /note="pulmonary surfactant
protein PSP-6"
Site 4 /site_type="binding" /note="palmitate (Cys) (covalent)"
Site 5 /site_type="binding" /note="palmitate (Cys) (covalent)"
ORIGIN 1 lipccpvnik rlllwvwv llvwivgal Imgl 

SEQ ID NO: 110

A43628 pulmonary surfactant protein A - human (fragments)
gi|280854|pir||A43628[280854]

FEATURES Location/Qualifiers source 1..35 /organism="Homo sapiens"
/db_xref="taxon:9606"
Protein 1..35 /product="pulmonary surfactant protein A"
ORIGIN 1 gqsitfdagk eqcvemytdg qwndrnclyl ticef 

SEQ ID NO: 111 

AAB48076 Surfactant protein B (SP-B) [Oryctolagus cuniculus]
gi|1850933|gb|AAB48076.1|[1850933]

FEATURES Location/Qualifiers source 1..370 /organism="Oryctolagus cuniculus"
/db_xref="taxon:9986" /tissue_type="liver"

Protein 1..370 /product="Surfactant protein B (SP-B)"
CDS 1..370 /gene="SP-B"
/coded_by="join(U40853.1:2194..2263,U40853.1:2591..2718,
U40853.1:2941..3012,U40853.1:3257..3382,
U40853.1:3590..3727,U40853.1:3925..4014,
U40853.1:6043..6226,U40853.1:6421..6581,
U40853.1:7266..7346,U40853.1:7829..7891)" /note="Surfactant protein B (SP-B) is
a key component of lung surfactant, a surface active material secreted by type II
epithelial cells of lung alveolus; SP-B maintains biophysical properties and
physiological function of surfactant; Pulmonary surfactant associated protein"
ORIGIN 1 makshlppwl lllllptlcg pgtavwatsp lacaqgpefvi/ cqsleqalqc kalghclqev 
61 wghvgaddic qecqdivnil tkmtkeaifq dtirkflehe cdvlplkllv pqchhvldvy 
121 fpltityfis qinakaicqh Iglcqpgspe ppldplpdkl viptllgalp akpgphtqdl

181 saqrfpipip Icwicrtllk riqamipkgv iamavaqvch wplwggic qciaerytvi 
241 llevllghvl pqlvcglvir cssvdsigqv pptlealpge wipqdpecpl cmsvttqarn 
301 iseqtrpqav yhaclssqid kqeceqfvel htpqilslls rgwdaraicq algacvatis 
361 piqciqsphf

SEQ ID NO: 112 

1901176A surfactant protein A gi|382753|prf||1901176A[382753]
FEATURES Location/Qualifiers source 1..247 /organism="Oryctolagus cuniculus"
/db_xref="taxon:9986"
ORIGIN 1 mlllslaltl isapasdtcd tkdvcigspg ipgtpgshgl pgrdgrdgvk gdpgppgpmg

61 ppggmpglpg rdgligapgv pgergdkgep gergppglpa yideelqati helrhhalqs 
121 igvlslqgsm kavgekifst ngqsvnfdai revcaraggr iaivkevprs leeneaiasr 
181 ntyaylglae gptagdfyyl dgdpvnytnw ypgeprgqgr ekcvemytdg kwndknclqy
241 rivicef 


SEQ ID NO: 113 

CAA53510 lung surfactant protein D [Bos taurus] 
gi|415939|emb|CAA53510.1|[415939]
sig_peptide 1..20
mat_peptide 21..369 /product="lung surfactant protein D"
CDS 1..369 /coded_by="X75911.1:102..1211" /db_xref="SWISS-PROT:P35246"
ORIGIN 1 mlllplsvll lltqpwrslg aemkiysqkt manactlvmc sppedgipgr dgrdgregpr
61 gekgdpgspg pagragmpgp agpiglkgdn gsagepgpkg dtgppgppgm pgpagregps
121 gkqgsmgppg tpgpkgdtgp kggvgapgiq gspgpaglkg ergapgepga pgragapgpa
181 gaigpqgpsg argppglkgd rgtpgergak gesglaevna Irqrvgileg qlqriqnafs
241 qykkamlfpn grsvgekifk tvgsektfqd aqqictqagg qlpsprsgae nealtqiata
301 qnkaaflsms dtrkegtfiy ptgeplvysn wapqepnndg gsencveifp ngkwndkvcg
361 eqrivicef

SEQ ID NO: 114

CAA53511 collectin-43 [Bos taurus] gi|499385|emb|CAA53511.1|[499385]

FEATURES Location/Qualifiers source 1..301 /organism="Bos taurus"
/db_xref="taxon:9913" /tissue_type="liver" /clone_lib="lambda gt 11"
Protein 1..301 /product="collectin-43"
mat_peptide 1..301 /product="collectin-43"
CDS 1..301 /coded_by="X75912.1:<1..906" /db_xref="SWISS-PROT:P42916"
ORIGIN 1 eemdvyxekt Itdpctlwc appadslrgh dgrdgkegpq gekgdpgppg mpgpagregp
61 sgrqgsmgpp gtpgpkgepg peggvgapgm pgspgpaglk gergapapgg aigpqgpsga
121 mgppglkgdr gdpgekgarg etsvlevdtl rqrmrnlege vqriqnivtq yrkavlfpdg
181 qavgekifkt agavksysda eqlcreakgq lasprssaen eavtqivrak nkhaylsmnd
241 iskegkftyp tggsldysnw apgepgnrak degpenclei ysdgnwndie creerlvice
301 f

SEQ ID NO: 115

CAA46152 lung surfactant protein D [Homo sapiens]
gi|34767|emb|CAA46152.1|[34767]
sig_peptide 1..20
mat_peptide 21..375 /product="lung surfactant protein D"
CDS 1..375 /gene="hsp-D" /coded_by="X65018.1:172..1299" /db_xref="SWISS-PROT:P35247"
ORIGIN 1 mllfllsalv lltqplgyle aennktyslirt tpsactlvmc ssvesglpgr dgrdgregpr
61 gekgdpgipg aagqagmpgq agpvgpkgdn gsvgepgpkg dtgpsgppgp pgvpgpagre
121 gplgkqgnig pqgkpgpkge agpkgevgap gmqgsagarg lagpkgergv pgergvpgna
181 gaagsagamg pqgspgargp pglkgdkgip gdkgakgesg Ipdvaslrqq vealqgqvqh
241 Iqaafsqykk velfpngqsv gekifktagf vkpfteaqll ctqaggqias prsaaenaal
301 qqiwaknea aflsmtdskt egkftyptge slvysnwapg epnddggsed dveiftngkw
361 ndracgekrl wcef

SEQ ID NO: 116

AAA92788 lung surfactant protein C [Rattus norvegicus]
gi|595282|gb|AAA92788.1|[595282]

FEATURES Location/Qualifiers source 1..194 /organism="Rattus norvegicus"
/db_xref="taxon:10116" /clone="sp-c" /tissue_type="liver"
Protein 1..194 /product="lung surfactant protein C"
CDS 1..194 /gene="sp-c"
/coded_by="join(U07796.1:1673..1714,U07796.1:2841..2999,
U07796.1:3252..3377,U07796.1:3598..3707,U07796.1:4053..4200)"
ORIGIN 1 nndmgskevim esppdystgp rsqfripccp vhlkriliw vvvvlvwvi vgallmglhm 
61 sqkhtemvle msiggapetq kiialsehtd tiatfsigst givlydyqrl Itaykpapgt
121 ycyimkmape sipslealar kfknfqakss tptsklgqee ghsagsdsds sgrdlaflgl 
181 avstlcgvlp lyyi 

SEQ ID NO: 117

AAA31468 surfactant protein A [Oryctolagus cuniculus]
gi|431446|gb|AAA31468.1|[431446]

FEATURES Location/Qualifiers source 1..247 /organism="Oryctolagus cuniculus"
/strain="New Zealand White" /db_xref="taxon:9986" /tissue_type="liver"
/dev_stage="adult"
Protein 1..247 /product="surfactant protein A"
CDS 1..247 /coded_by="join(L19387.1:3864..4032,L19387.1:4241..4360,
L19387.1:5010..5087,L19387.1:5533..5909)"
ORIGIN 1 mlllslaltl isapasdtcd tkdvcigspg ipgtpgshgi pgrdgrdgvk gdpgppapwa

61 ppggmpgipg rdgligapgv pgergdkgep gergppglpa yideelqati helrhhalqs 
121 igvlslqgsm kavgekifst ngqsvnfdai revcaraggr iavprsleen eaiasivker 
181 ntyaylglae gptagdfyyl dgdpvnytnw ypgeprgqgr ekcvemytdg kwndknclqy 
241 rivicef 



Mannose binding lectin 

SEQ ID NO: 1

NP_034897 mannan-binding lectin serine protease 2 [Mus musculus]
gi|6754642|ref|NP_034897.1|[6754642]
sig_peptide 1..15
mat_peptide 16..185 /product="mannan-binding lectin serine protease 2"
Region 28..137 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein" /note="CUB" /db_xref="CDD:smart00042"
Region 28..134 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 138..180 /region_name="Calcium-binding EGF-like domain"
/note="EGF_CA" /db_xref="CDD:smart00179"
variation 172 /allele="T" /allele="V" /db_xref="dbSNP:3167338"
CDS 1..185 /gene="Masp2" /coded_by="NM_010767.1:32..589"
/db_xref="LocusID:17175" /db_xref="MGD:1330832"

ORIGIN 1 mrlliflgll wslvatllgs kwpepvfgri vspgfpekya dhqdrswtit appgyrlriy
61 fthfdielsy rceydfvkis sgtkvlatic gqestdteqa pgndtfyslg psikvtfhsd 
121 ysnekpftgf eafyaaedvd ecrvslgdsv pcdhychnyl ggyycscrag yvlhqnkhtc 
181 seqsl 

SEQ ID NO: 2

AAH10760 Similar to mannose binding lectin, serum (C) [Mus musculus]
gi|14789670|gb|AAH10760.1|[14789670]

source 1..244 /organism="Mus musculus" /strain="FVB/N" /db_xref="taxon:10090"
/clone="MGC:18500 IMAGE:4212216" /tissue_type="Liver, normal. 5 month old
male mouse." /clone_lib="NCI_CGAP_Li9" /lab_host="DH10B" /note="Vector:
pCMV-SPORT6"
Protein 1..244 /product="Similar to mannose binding lectin, serum (C)"
CDS 1..244 /coded_by="BC010760.1:192..926"

ORIGIN 1 msiflsflll cwtwyaet Itegvqnscp vvtcsspgin gfpgkdgrdg akgekgepgq
61 girglqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseidseiaa Irselrairn
121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl
181 gitdvrvegs fedltgnrvr ytnwndgepn ntgdgedcvv ilgngkwndv pcsdsflaic
241 efsd

SEQIDNO:3 

AAH21762 mannose binding lectin, liver (A) [Mus musculus]
gi|18256010|gb|AAH21762.1|[18256010]

source 1..239 /organism="Mus musculus" /strain="FVB/N" /db_xref="taxon:10090"
/clone="MGC:30242 IMAGE:5132514" /tissue_type="Liver, normal. 5 month old
male mouse." /clone_lib="NCI_CGAP_Li9" /lab_host="DH10B" /note="Vector:
pCMV-SPORT6"
Protein 1..239 /product="mannose binding lectin, liver (A)"
/db_xref="LocusID:17194"
CDS 1..239 /coded_by="BC021762.1:76..794" /db_xref="LocusID:17194"

ORIGIN 1 mlllpllpvl lcvvsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg
61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeirilkski qltnklhafs
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaiprnae enkaiqevat giaflgitde
181 ategqfmyvt ggrltysnwk kdepnnhgsg edcviildng lwndiscqas fkavcefpa

SEQ ID NO: 4

Q9NPY3 Complement component C1q receptor precursor (Complement component
1, q subcomponent, receptor 1) (C1qRp) (C1qR(p)) (C1q/MBL/SPA receptor) (CD93
antigen) (CDw93) gi|21759074|sp|Q9NPY3|CD93_HUMAN[21759074]

source 1..652 /organism="Homo sapiens" /db_xref="taxon:9606"
gene 1..652 /gene="C1QR1" /note="CD93"
Protein 1..652 /gene="C1QR1" /product="Complement component C1q receptor
precursor"
Region 1..21 /gene="C1QR1" /region_name="Signal"
Region 22..652 /gene="C1QR1" /region_name="Mature chain"
/note="COMPLEMENT COMPONENT C1Q RECEPTOR."
Region 22 /gene="C1QR1" /region_name="Conflict" /note="T -> V (IN AA
SEQUENCE)."
Region 24..580 /gene="C1QR1" /region_name="Domain" /note="EXTRACELLULAR
(POTENTIAL)."
Region 32..174 /gene="C1QR1" /region_name="Domain" /note="C-TYPE LECTIN."
Region 36 /gene="C1QR1" /region_name="Conflict" /note="C -> T (IN AA
SEQUENCE)."
Region 38..39 /gene="C1QR1" /region_name="Conflict" /note="TA -> Rl (IN AA
SEQUENCE)."
Region 155 /gene="C1QR1" /region_name="Conflict" /note="S -> N (IN REF. 1)."
Region 186 /gene="C1QR1" /region_name="Conflict" /note="G -> A (IN AA
SEQUENCE)."
Region 260..301 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 1."
Bond bond(264,275) /gene="C1QR1" /bond_type="disulfide" /note="BY
SIMILARITY."
Bond bond(271,285) /gene="C1QR1" /bond_type="disulfide" /note="BY
SIMILARITY."
Bond bond(287,300) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 302..344 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 2."
Bond bond(306,317) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(311,328) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 318 /gene="C1QR1" /region_name="Variant" /note="V -> A."
Site 325 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Bond bond(330,343) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 345..384 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 3,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(349,358) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(354,367) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(369,383) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 385..426 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 4,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(389,400) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(396,409) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(411,425) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 427..468 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 5,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(431,443) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(439,452) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(454,467) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 492 /gene="C1QR1" /region_name="Conflict" /note="S -> A (IN AA
SEQUENCE)."
Region 496 /gene="C1QR1" /region_name="Conflict" /note="R -> Q (IN AA
SEQUENCE)."
Region 504 /gene="C1QR1" /region_name="Conflict" /note="R -> G (IN AA
SEQUENCE)."
Region 541 /gene="C1QR1" /region_name="Conflict" /note="P -> S (IN REF. 1)."
Region 581..601 /gene="C1QR1" /region_name="Transmembrane region"
/note="POTENTIAL."
Region 602..652 /gene="C1QR1" /region_name="Domain" /note="CYTOPLASMIC
(POTENTIAL)."

ORIGIN 1 matsmgllll llllltqpga gtgadteavv cvgtacytah sgklsaaeaq nhcnqnggnl
61 atvkskeeaq hvqrvlaqll rreaaltarm skfwiglqre kgkcldpsip lkgfswvggg
121 edtpysnwhk eirnsciskr cvsllldlsq pllpsripkw segpcgspgs pgsniegfvc
181 kfsfkgmcrp lalggpgqvt yttpfqttss sleavpfasa anvacgegdk detqshyflc
241 kekapdvfdw gssgplcvsp kygcnfnngg chqdcfeggd gsflcgcrpg frilddivtc
301 asrnpcsssp crggatcvlg phgknytcrc pqgyqidssq ldcvdvdecq dspcaqecvn
361 tpggfrcecw vgyepggpge gacqdvdeca lgrspcaqgc tnldgsfhcs ceegyvlage
421 dgtqcqdvde cvgpggplcd sicfntqgsf hcgclpgwvl apngvsctmg pvslgppsgp
481 pdeedkgeke gstvpraata sptrgpegtp katpttsrps lssdapitsa pikmlapsgs
541 pgvwrepsih hataasgpqe paggdssvat qnndgtdgqk lllfyilgtv vaillllala
601 igllvyrkrr akreekkekk pqnaadsysw vperaesram enqysptpgt dc

SEQ ID NO: 5

O89103 Complement component C1q receptor precursor (Complement component
1, q subcomponent, receptor 1) (C1qRp) (C1qR(p)) (C1q/MBL/SPA receptor) (CD93
antigen) (Cell surface antigen AA4) (Lymphocyte antigen 68)
gi|21541998|sp|O89103|CD93_MOUSE[21541998]

source 1..644 /organism="Mus musculus" /db_xref="taxon:10090"
gene 1..644 /gene="C1QR1" /note="CD93; C1QRP; LY68; AA4"
Protein 1..644 /gene="C1QR1" /product="Complement component C1q receptor
precursor"
Region 1..22 /gene="C1QR1" /region_name="Signal" /note="POTENTIAL."
Region 23..644 /gene="C1QR1" /region_name="Mature chain"
/note="COMPLEMENT COMPONENT C1Q RECEPTOR."
Region 23..572 /gene="C1QR1" /region_name="Domain" /note="EXTRACELLULAR
(POTENTIAL)."
Region 31..173 /gene="C1QR1" /region_name="Domain" /note="C-TYPE LECTIN."
Site 102 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Region 257..298 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 1."
Bond bond(261,272) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(268,282) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(284,297) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 299..341 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 2."
Bond bond(303,314) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(308,325) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Site 322 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Bond bond(327,340) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 342..381 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 3,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(346,355) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(351,364) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(366,380) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 382..423 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 4,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(386,397) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(393,406) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(408,422) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 424..465 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 5,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(428,440) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(436,449) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(451,464) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 573..593 /gene="C1QR1" /region_name="Transmembrane region"
/note="POTENTIAL."
Region 594..644 /gene="C1QR1" /region_name="Domain" /note="CYTOPLASMIC
(POTENTIAL)."

ORIGIN 1 maistglfll lgllgqpwag aaadsqavvc egtacytahw gklsaaeaqh rcnenggnia
61 tvkseeearh vqqaltqilk tkapleakmg kfwiglqrek gnctyhdipm rgfswvggge
121 dtaysnwyka sksscifkrc vslildlsit phpshlpkwh espcgtpeap gnsiegfick
181 fnfkgmcrpl alggpgrvty ttpfqattss leavpfasva nvacgdeaks ethyflcnek
241 tpgifhwgss gplcvspkfg csfnnggcqq dcfeggdgsf rcgcrpgftl lddlvtcasr
301 npcssnpctg ggmchsvpis enytcrcpsg yqldssqvhc vdidecqdsp caqdcvnt^
361 sfhcecwvgy qpsgpkeeac edvdecaaan spcaqgcint dgsfycscke gyivsgedst
421 qcedidecsd argnpcdsic fntdgsfrcg cppgwelapn gvfcsrgtvf selparppqk
481 ednddrkest mpptempssp sgskdvsnra qttglfvqsd iptasvplei eipsevsdvw
541 felgtylptt sghskpthed svsahsdtdg qnlllfyilg tvvaislllv lalgiliyhk
601 rrakkeeike kkpqnaadsy swvperaesq apenqysptp gtdc

SEQ ID NO: 6

P09871 Complement C1s component precursor (C1 esterase)
gi|115205|sp|P09871|C1S_HUMAN[115205]

source 1..688 /organism="Homo sapiens" /db_xref="taxon:9606"
gene 1..688 /gene="C1S"
Protein 1..688 /gene="C1S" /product="Complement C1s component precursor"
/EC_number="3.4.21.42"
Region 1..15 /gene="C1S" /region_name="Signal"
Region 16..437 /gene="C1S" /region_name="Mature chain" /note="COMPLEMENT
C1S HEAVY CHAIN."
Region 16..130 /gene="C1S" /region_name="Domain" /note="CUB 1."
Bond bond(65,83) /gene="C1S" /bond_type="disulfide"
Region 131..172 /gene="C1S" /region_name="Domain" /note="EGF-LIKE,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(135,147) /gene="C1S" /bond_type="disulfide"
Bond bond(143,156) /gene="C1S" /bond_type="disulfide"
Site 149 /gene="C1S" /site_type="hydroxylation" /note="(PROBABLE)."
Bond bond(158,171) /gene="C1S" /bond_type="disulfide"
Site 174 /gene="C1S" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)."
Region 175..290 /gene="C1S" /region_name="Domain" /note="CUB 2."
Bond bond(175,202) /gene="C1S" /bond_type="disulfide"
Bond bond(234,251) /gene="C1S" /bond_type="disulfide"
Region 293..355 /gene="C1S" /region_name="Domain" /note="SUSHI 1."
Bond bond(294,341) /gene="C1S" /bond_type="disulfide"
Region 294 /gene="C1S" /region_name="Conflict" /note="C -> K (IN REF. 6)."
Bond bond(321,354) /gene="C1S" /bond_type="disulfide"
Region 358..422 /gene="C1S" /region_name="Domain" /note="SUSHI 2."
Bond bond(359,403) /gene="C1S" /bond_type="disulfide"
Bond bond(386,421) /gene="C1S" /bond_type="disulfide"
Site 406 /gene="C1S" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)."
Bond bond(425,549) /gene="C1S" /bond_type="disulfide" /note="INTERCHAIN."
Region 438..688 /gene="C1S" /region_name="Mature chain" /note="COMPLEMENT
C1S LIGHT CHAIN."
Region 438..688 /gene="C1S" /region_name="Domain" /note="SERINE PROTEASE."
Site 475 /gene="C1S" /site_type="active" /note="CHARGE RELAY SYSTEM."
Region 513 /gene="C1S" /region_name="Conflict" /note="G -> GG (IN REF. 5)."
Site 529 /gene="C1S" /site_type="active" /note="CHARGE RELAY SYSTEM."
Region 573 /gene="C1S" /region_name="Conflict" /note="T -> A (IN REF. 7)."
Bond bond(595,618) /gene="C1S" /bond_type="disulfide"
Bond bond(628,659) /gene="C1S" /bond_type="disulfide"
Site 632 /gene="C1S" /site_type="active" /note="CHARGE RELAY SYSTEM."
Region 645..646 /gene="C1S" /region_name="Conflict" /note="TK -> GR (IN REF.
7)."

ORIGIN 1 mwclvlfsll awvyaeptmy geilspnypq aypseveksw dievpegygi hlyfthldie
61 isencaydsv qiisgdteeg ricgqrssnn phspiveefq vpynklqvif ksdfsneerf
121 tgfaayyvat dinectdfvd vpcshfcnnf iggyfcscpp eyflhddmkn cgvncsgdvf
181 taligeiasp nypkpypens rceyqiriek gfqvvvtirr edfdveaads agncldslvf
241 vagdrqfgpy cghgfpgpin ietksnaldi ifqtdltgqk kgwklryhgd pmpcpkedtp
301 nsvwepakak yvfrdvvqit cidgfevveg rvgatsfyst cqsngkwsns kikcqpvdcg
361 ipesiengkv edpestlfgs virytceepy yymengggge yhcagngswv nevlgpelpk
421 cvpvcgvpre pfeekqriig gsdadiknfp wqvffdnpwa ggalipeywv ltaahvvegn
481 reptmyvgst svqtsrlaks kmltpehvfi hpgwkllevp egrtnfdndi alvrlkdpvk
541 mgptvspici pgtssdynim dgdiglisgw grtekrdrav rikaarlpva pirkckevkv
601 ekptadaeay vftpnmicag gekgmdsckg dsggafavqd pndktkfyaa givswgpqcg
661 tyglytrvkn yvdwimktmq enstpred

SEQ ID NO: 7 

NP_036204 complement component 1, q subcomponent, receptor 1; complement 
component C1q receptor [Homo sapiens]
gi|6912282|ref|NP_036204.1|[6912282]

source 1..652 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="20"
/map="20p11.21"
Protein 1..652 /product="complement component 1, q subcomponent, receptor 1"
/note="complement component C1q receptor"
Region 32..130 /region_name="smart00034, CLECT, C-type lectin (CTL) or
carbohydrate-recognition domain (CRD); Many of these domains function as
calcium-dependent carbohydrate binding modules"
Region 47..128 /region_name="pfam00059, lectin_c, Lectin C-type domain. This
family includes both long and short form C-type"
Region 385..426 /region_name="smart00179, EGF_CA, Calcium-binding EGF-like
domain"
CDS 1..652 /gene="C1QR1" /coded_by="NM_012072.2:149..2107"
/note="C1q/MBL/SPA receptor" /db_xref="LocusID:22918" /db_xref="MIM:120577"

ORIGIN 1 matsmgllll llllltqpga gtgadteavv cvgtacytah sgklsaaeaq nhcnqnggnl
61 atvkskeeaq hvqrvlaqll rreaaltarm skfwiglqre kgkcldpsip lkgfswvggg
121 edtpysnwhk eirnsciskr cvsllldlsq pllpnrlpkw segpcgspgs pgsniegfvc
181 kfsfkgmcrp lalggpgqvt yttpfqttss sleavpfasa anvacgegdk detqshyfic
241 kekapdvfdw gssgplcvsp kygcnfnngg chqdcfeggd gsflcgcrpg frilddlvtc
301 asrnpcsssp crggatcvig phgknytcrc pqgyqidssq ldcvdvdecq dspcaqecvn
361 tpggfrcecw vgyepggpge gacqdvdeca lgrspcaqgc tntdgsfhcs ceegyvlage
421 dgtqcqdvde cvgpggplcd sicfntqgsf hcgclpgwvl apngvsctmg pvslgppsgp
481 pdeedkgeke gstvpraata sptrgpegtp katpttsrps lssdapitsa pikmlapsgs
541 sgvwrepsih hataasgpqe paggdssvat qnndgtdgqk lllfyilgtv vaillllala
601 igllvyrkrr akreekkekk pqnaadsysw vperaesram enqysptpgt dc

SEQ ID NO: 8

NP_000233 soluble mannose-binding lectin precursor; mannose-binding lectin;
mannose binding protein; Mannose-binding lectin 2, soluble (opsonic defect) [Homo
sapiens]
gi|4557739|ref|NP_000233.1|[4557739]

sig_peptide 1..20
mat_peptide 21..248 /product="soluble mannose-binding lectin"
variation 54 /allele="D" /allele="G" /db_xref="dbSNP:1800450"
variation 57 /allele="E" /allele="G" /db_xref="dbSNP:1800451"
Region 134..245 /region_name="smart00034, CLECT, C-type lectin (CTL) or
carbohydrate-recognition domain (CRD); Many of these domains function as
calcium-dependent carbohydrate binding modules"
Region 144..246 /region_name="pfam00059, lectin_c, Lectin C-type domain. This
family includes both long and short form C-type"
CDS 1..248 /gene="MBL2" /coded_by="NM_000242.1:66..812"
/db_xref="LocusID:4153" /db_xref="MIM:154545"

ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwitfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh
241 lavcefpi

SEQ ID NO: 9

P11226 Mannose-binding protein C precursor (MBP-C) (MBP1) (Mannan-binding
protein) (Mannose-binding lectin) gi|126676|sp|P11226|MABC_HUMAN[126676]

source 1..248 /organism="Homo sapiens" /db_xref="taxon:9606"
gene 1..248 /gene="MBL2" /note="MBL"
Protein 1..248 /gene="MBL2" /product="Mannose-binding protein C precursor"
Region 1..20 /gene="MBL2" /region_name="Signal"
Region 21..248 /gene="MBL2" /region_name="Mature chain"
/note="MANNOSE-BINDING PROTEIN C."
Region 21..41 /gene="MBL2" /region_name="Domain" /note="CYS-RICH."
Region 24 /gene="MBL2" /region_name="Variant" /note="T -> A (IN CHINESE).
/FTId=VAR_013294."
Region 42..99 /gene="MBL2" /region_name="Domain" /note="COLLAGEN-LIKE."
Site 47 /gene="MBL2" /site_type="hydroxylation"
Region 52 /gene="MBL2" /region_name="Variant" /note="R -> C (IN 0.05% OF
EUROPEAN AND AFRICAN POPULATIONS). /FTId=VAR_008543."
Region 54 /gene="MBL2" /region_name="Variant" /note="G -> D (IN CAUCASIAN
AND CHINESE POPULATIONS). /FTId=VAR_004182."
Region 57 /gene="MBL2" /region_name="Variant" /note="G -> E (IN WEST
AFRICAN POPULATION). /FTId=VAR_004183."
Site 73 /gene="MBL2" /site_type="hydroxylation"
Site 79 /gene="MBL2" /site_type="hydroxylation"
Site 82 /gene="MBL2" /site_type="hydroxylation"
Site 88 /gene="MBL2" /site_type="hydroxylation"
Region 109 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 110..129 /gene="MBL2" /region_name="Helical region"
Region 130 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 132..134 /gene="MBL2" /region_name="Beta-strand region"
Region 135..136 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 137..147 /gene="MBL2" /region_name="Beta-strand region"
Region 148..157 /gene="MBL2" /region_name="Helical region"
Region 153..246 /gene="MBL2" /region_name="Domain" /note="C-TYPE LECTIN
(SHORT FORM)."
Bond bond(155,244) /gene="MBL2" /bond_type="disulfide"
Region 158..159 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 161..162 /gene="MBL2" /region_name="Beta-strand region"
Region 168..177 /gene="MBL2" /region_name="Helical region"
Region 182..187 /gene="MBL2" /region_name="Beta-strand region"
Region 192..193 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 196..197 /gene="MBL2" /region_name="Beta-strand region"
Region 198..199 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 202 /gene="MBL2" /region_name="Beta-strand region"
Region 208 /gene="MBL2" /region_name="Beta-strand region"
Region 210..211 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 216..218 /gene="MBL2" /region_name="Helical region"
Bond bond(222,236) /gene="MBL2" /bond_type="disulfide"
Region 222..225 /gene="MBL2" /region_name="Beta-strand region"
Region 227..228 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 231..234 /gene="MBL2" /region_name="Beta-strand region"
Region 236..237 /gene="MBL2" /region_name="Hydrogen bonded turn"
Region 239..248 /gene="MBL2" /region_name="Beta-strand region"

ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwitfsl gkqvgnkffl tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh
241 lavcefpi

SEQ ID NO: 10

Q9ET61 Complement component C1q receptor precursor (Complement component
1, q subcomponent, receptor 1) (C1qRp) (C1qR(p)) (C1q/MBL/SPA receptor) (CD93
antigen) (Cell surface antigen AA4) gi|21541989|sp|Q9ET61|CD93_RAT[21541989]

source 1..643 /organism="Rattus norvegicus" /db_xref="taxon:10116"
gene 1..643 /gene="C1QR1" /note="CD93; C1QRP2"
Protein 1..643 /gene="C1QR1" /product="Complement component C1q receptor
precursor"
Region 1..23 /gene="C1QR1" /region_name="Signal" /note="POTENTIAL."
Region 24..643 /gene="C1QR1" /region_name="Mature chain"
/note="COMPLEMENT COMPONENT C1Q RECEPTOR."
Region 24..571 /gene="C1QR1" /region_name="Domain" /note="EXTRACELLULAR
(POTENTIAL)."
Region 31..173 /gene="C1QR1" /region_name="Domain" /note="C-TYPE LECTIN."
Region 257..298 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 1."
Bond bond(261,272) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(268,282) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(284,297) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 299..341 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 2."
Bond bond(303,314) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(308,325) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Site 322 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Bond bond(327,340) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 342..381 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 3,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(346,355) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(351,364) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(366,380) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Region 382..423 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 4,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(386,397) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(393,406) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(408,422) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."

RMi^?iV/gene="C1QR1" /region_name="Conflict" /note="E -> K (IN REF. 2)." 
Region 424..462 /gene="C1QR1" /region_name="Domain" /note="EGF-LIKE 5,
CALCIUM-BINDING (POTENTIAL)."
Bond bond(428,437) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(433,446) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Bond bond(448,461) /gene="C1QR1" /bond_type="disulfide" /note="BY SIMILARITY."
Site 498 /gene="C1QR1" /site_type="glycosylation" /note="N-LINKED (GLCNAC...)
(POTENTIAL)."
Region 572..592 /gene="C1QR1" /region_name="Transmembrane region"
/note="POTENTIAL."
Region 593..643 /gene="C1QR1" /region_name="Domain" /note="CYTOPLASMIC
(POTENTIAL)."



ORIGIN 1 mvtstgllll lgllgqiwag aaadseavvc egtacytahw gklsaaeaqh rcnenggnia
61 tvkseeearh vqealaqllk tkapsetkig kfwiglqrek gkctyhdipm kgfswvggge
121 dttysnwyka skssciskrc vslildlslk phpshlpkwh espcgtpdap gnsiegfick
181 fnfkgmcspl alggpgqity ttpfqattss lkavpfasva nwcgdeaes ktnyyicket
241 tagvfhwgss gplcvspkfg csfnnggcqq dcfeggdgsf rcgcrpgfrl lddlvtcasr

WO 2004/024925 PCT/DK2003/000585

301 npcssnpctg ggmchsvpls enytchcprg yqldssqvhc vdidecedsp cdqecintpg
361 gfhcecwvgy qssgskeeac edvdectaay spcaqgctnt dgsfycscke gyimsgedst
421 qcedideclg npcdtlcint dgsfrcgcpa gfelapngvs ctrgsmfsel parppqkedk
481 gdgkestvpl tempgslngs kdvsnraqtt disiqsdsst asvpleievs seasdvwidi
541 gtylpttsgh sqpthedsvp ahsdsdtdgq klllfyilgt vvaislllal alglliylkr
601 kakkeeikek kaqnaadsys wiperaesra penqysptpg tdc

SEQ ID NO: 11

NP_006601 mannan-binding lectin serine protease 2, isoform 1 precursor; MBL-
associated plasma protein of 19 kD; small MBL-associated protein [Homo sapiens]
gi|21264363|ref|NP_006601.2|[21264363]

sig_peptide 1..15
mat_peptide 16..444 /product="mannan-binding lectin serine protease 2, isoform 1,
chain A"
Region 28..136 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
Region 28..134 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 138..180 /region_name="Calcium-binding EGF-like domain"
/note="EGF_CA" /db_xref="CDD:smart00179"
variation 155 /allele="H" /allele="R" /db_xref="dbSNP:2273343"
Region 184..295 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
Region 184..293 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 300..361 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 300..361 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"
Region 366..430 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 366..430 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"
variation 377 /allele="A" /allele="V" /db_xref="dbSNP:2273346"
Region 444..679 /region_name="Trypsin-like serine protease" /note="Tryp_SPc"
/db_xref="CDD:smart00020"
mat_peptide 445..686 /product="mannan-binding lectin serine protease 2,
isoform 1, chain B"
Region 445..679 /region_name="Trypsin" /note="trypsin"
/db_xref="CDD:pfam00089"
CDS 1..686 /gene="MASP2" /coded_by="NM_006610.2:22..2082"
/db_xref="LocusID:10747" /db_xref="MIM:605102"

ORIGIN 1 mrlltllgll cgsvatplgp kwpepvfgri aspgfpgeya ndqerrwtlt appgyrlrly
61 fthfdielsh lceydfvkls sgakvlatic gqestdtera pgkdtfyslg sslditfrsd
121 ysnekpftgf eafyaaedid ecqvapgeap tcdhhchnhl ggfycscrag yvlhrnkrtc
181 salcsgqvft qrsgelsspe yprpypkiss ctysisleeg fsvildfves fdvethpetl
241 cpydflkiqt dreehgpfcg ktlphrietk sntvtitfvt desgdhtgwk ihytstaqpc
301 pypmappngh vspvqakyil kdsfsifcet gyellqghip lksftavcqk dgswdrpmpa
361 cslvdcgppd dipsgrveyi tgpgvttyka viqysceetf ytmkvndgky vceadgfwts
421 skgekslpvc epvcglsart tggriyggqk akpgdfpwqv lilggttaag allydnwvit
481 aahavyeqkh dasaldirmg tlkrlsphyt qawseavfih egythdagfd ndialikinn
541 kvvinsnitp iciprkeaes fmrtddigta sgwgltqrgf larnlmyvdi pivdhqkcta
601 ayekppyprg svtanmlcag lesggkdscr gdsggalvfl dseterwfvg givswgsmnc
661 geagqygvyt kvinyipwie niisdf



SEQ ID NO: 12 

NP_631947 mannan-binding lectin serine protease 2, isoform 2 precursor; MBL-
associated plasma protein of 19 kD; small MBL-associated protein [Homo sapiens]
gi|21264361|ref|NP_631947.1|[21264361]

sig_peptide 1..15
mat_peptide 16..185 /product="mannan-binding lectin serine protease 2, isoform 2"
Region 28..136 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
Region 28..134 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 138..180 /region_name="Calcium-binding EGF-like domain"
/note="EGF_CA" /db_xref="CDD:smart00179"
variation 155 /allele="H" /allele="R" /db_xref="dbSNP:2273343"
CDS 1..185 /gene="MASP2" /coded_by="NM_139208.1:22..579"
/db_xref="LocusID:10747" /db_xref="MIM:605102"
ORIGIN 1 mrlltllgll cgsvatplgp kwpepvfgri aspgfpgeya ndqerrwtlt appgyrirly
61 fthfdlelsh lceydfvkis sgakvlatlc gqestdtera pgkdtfyslg sslditfrsd
121 ysnekpftgf eafyaaedid ecqvapgeap tcdhhchnhl ggfycscrag yvlhrnkrtc
181 seqsl

SEQ ID NO: 13 

NP_624302 mannan-binding lectin serine protease 1, isoform 2, precursor;
protease, serine, 5 (mannose-binding protein-associated); manan-binding lectin
serine protease-1; Ra-reactive factor serine protease p100 [Homo sapiens]
gi|21264359|ref|NP_624302.1|[21264359]

sig_peptide 1..19
mat_peptide 20..445 /product="mannan-binding lectin serine protease 1, isoform 2,
chain A"
variation 21 /allele="I" /allele="T" /db_xref="dbSNP:1062049"
Region 23..138 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
Region 23..135 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 139..181 /region_name="Calcium-binding EGF-like domain"
/note="EGF_CA" /db_xref="CDD:smart00179"
Region 185..294 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 190..296 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
variation 235 /allele="Q" /allele="E" /db_xref="dbSNP:3203210"
variation 258 /allele="P" /allele="A" /db_xref="dbSNP:866085"
Region 301..362 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 301..362 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"
Region 367..432 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 367..432 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"
mat_peptide 446..728 /product="mannan-binding lectin serine protease 1,
isoform 2, chain B"
Region 449..711 /region_name="Trypsin-like serine protease" /note="Tryp_SPc"
/db_xref="CDD:smart00020"
Region 450..711 /region_name="Trypsin" /note="trypsin"
/db_xref="CDD:pfam00089"
variation 616 /allele="A" /allele="V" /db_xref="dbSNP:2461280"
Region 661..703 /region_name="Immunoglobulin A1 protease" /note="IGA1"
/db_xref="CDD:pfam02395"
CDS 1..728 /gene="MASP1" /coded_by="NM_139125.1:51..2237"
/db_xref="LocusID:5648" /db_xref="MIM:600521"

ORIGIN 1 mnwlllyyal cfslskasah tvelnnmfgq iqspgypdsy psdsevtwni tvpdgfriki
61 yfmhfniess ylceydyvkv etedqvlatf cgrettdteq tpgqevvisp gsfmsitfrs
121 dfsneerftg fdahymavdv deckeredee lscdhychny iggyycscrf gyilhtdnrt
181 crvecsdnlf tqrtgvitsp dfpnpypkss eclytielee gfmvniqfed ifdiedhpev
241 pcpydyikik vgpkvlgpfc gekapepist qshsvlilfh sdnsgenrgw risyraagne
301 cpelqppvhg kiepsqakyf fkdqvlvscd tgykvlkdnv emdtfqiecl kdgtwsnkip
361 tckivdcrap gelehglitf strnnlttyk seikyscqep yykmlnnntg iytcsaqgvw
421 mnkvlgrsip tclpecgqps rslpslvkri iggrnaepgl fpwqallvve dtsrvpndkw
481 fgsgallsas wiltaahvir sqrrdttvip vskehvtvyl glhdvrdksg avnssaarvv
541 lhpdfniqny nhdialvqiq epvplgphvm pvclprlepe gpaphmlglv agwgisnpnv
601 tvdeiissgt rtlsdvlqyv kipvvphaec ktsyesrsgn ysvtenmfca gyyeggkdtc
661 lgdsggafvi fddlsqrwvv qglvswggpe ecgskqvygv ytkvsnyvdw vweqmglpqs
721 vvepqver

SEQ ID NO: 14

NP_001870 mannan-binding lectin serine protease 1, isoform 1, precursor;
protease, serine, 5 (mannose-binding protein-associated); manan-binding lectin
serine protease-1; Ra-reactive factor serine protease p100 [Homo sapiens]
gi|21264357|ref|NP_001870.3|[21264357]

sig_peptide 1..19
mat_peptide 20..448 /product="mannan-binding lectin serine protease 1, isoform 1,
chain A"
variation 21 /allele="I" /allele="T" /db_xref="dbSNP:1062049"
Region 23..138 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
Region 23..135 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 139..181 /region_name="Calcium-binding EGF-like domain"
/note="EGF_CA" /db_xref="CDD:smart00179"
Region 185..294 /region_name="CUB domain" /note="CUB"
/db_xref="CDD:pfam00431"
Region 190..296 /region_name="Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein." /note="CUB" /db_xref="CDD:smart00042"
variation 235 /allele="Q" /allele="E" /db_xref="dbSNP:3203210"
variation 258 /allele="P" /allele="A" /db_xref="dbSNP:866085"
Region 301..362 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 301..362 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"
Region 367..432 /region_name="Domain abundant in complement control proteins"
/note="CCP" /db_xref="CDD:smart00032"
Region 367..432 /region_name="Sushi domain (SCR repeat)" /note="sushi"
/db_xref="CDD:pfam00084"



Region 448..691 /region_name="Trypsin-like serine protease" /note="Tryp_SPc"
/db_xref="CDD:smart00020"
mat_peptide 449..699 /product="mannan-binding lectin serine protease 1,
isoform 1, chain B"
Region 449..691 /region_name="Trypsin" /note="trypsin"
/db_xref="CDD:pfam00089"
Region 644..675 /region_name="Immunoglobulin A1 protease" /note="IGA1"
/db_xref="CDD:pfam02395"
CDS 1..699 /gene="MASP1" /coded_by="NM_001879.3:51..2150"
/db_xref="LocusID:5648" /db_xref="MIM:600521"

ORIGIN 1 mnwlllyyal cfslskasah tvelnnmfgq iqspgypdsy psdsevtwni tvpdgfriki 
61 yfmhfniess ylceydyvkv etedqvlatf cgrettdteq tpgqewlsp gsfmsitfrs
121 dfsneerftg fdahymavdv deckeredee lscdhychny iggyycscrf gyilhtdnrt
181 crvecsdnlf tqrtgvitsp dfpnpypkss eclytielee gfmvniqfed ifdiedhpev
241 pcpydyikik vgpkvigpfc gekapepist qshsvliifh sdnsgenrgw risyraagne

301 cpelqppvhg kiepsqakyf fkdqvlvscd tgykvlkdnv emdtfqiecl kdgtwsnkip 
361 tckivdcrap gelehgiitf strnnlttyk seikyscqep yykmlnnntg lytcsaqgvw 
421 mnkvlgrsip tclpvcglpk fsrklmarif ngrpaqkgtt pwiamjsiiln gqPTcggsIl 
481 gsswivtaah clhqsldped ptirdsdils psdfkiilgk hwrlrsdene qhlgvkhtti 
541 hpqydpntfe ndvalvelle spvlnafvmp icipegpqqe gamv.vsgwg kqflqrfpet
601 lrneielpivd hstcqkayap lkkkvtrdmi cagekeggkd acagdsggpm vtlnrergqw
661 ylygtvswgd dcgkkdrygv ysyihhnkdw iqrvtgvrn 

XP_122683 similar to mannose binding lectin, liver (A) [Mus musculus]
gi|20872845|ref|XP_122683.1|[20872845]
source 1..239 /organism="Mus musculus" /strain="C57BL/6J"
/db_xref="taxon:10090" /chromosome="14"
Protein 1..239 /product="similar to mannose binding lectin, liver (A)"

Region 126..236 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 135..237 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"
CDS 1..239 /gene="Mbl1" /coded_by="XM_122683.1:10..729"
/db_xref="LocusID:17194" /db_xref="MGD:96923"

ORIGIN 1 mlllpllpvl lcvvsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg
61 ppgklgppgs vgspgspgpk gqkgdhgdnr xxxxxxxxxx xxxxxxxxxx xxxxxxhafs
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaiprnae enkaiqevat giaflgitde
181 ategqfmyvt ggrltysnwk kdepnnhgsg edcviildng lwndiscqas fkavcefpa



SEQ ID NO: 16

AAM21196 C-type mannose-binding lectin [Oncorhynchus mykiss]
gi|20385163|gb|AAM21196.1|AF363271_1[20385163]

source 1..185 /organism="Oncorhynchus mykiss" /db_xref="taxon:8022"
Protein 1..185 /product="C-type mannose-binding lectin"
CDS 1..185 /gene="MBL" /coded_by="AF363271.1:25..582"

ORIGIN 1 meklaillll sasialgdan ltqllglepl lktkveqttp eaqveavqeg ikegscpsdw
61 ytygshcfkf vsiqqsfvds eqnclalggn lasvhslley qfmqaltkda nghlhstwig
121 gfdaikegtw mwsdgsrfdy tnwdtdepnn agegedclhm naasaklwfd vpcewkfasi
181 csrrm

SEQ ID NO: 17

AAD45377 mannose-binding lectin [Sus scrofa] 
gi|5566370|gb|AAD45377.1|AF164576_1[5566370]

source 1..240 /organism="Sus scrofa" /db_xref="taxon:9823" /tissue_type="liver"
Protein 1..240 /product="mannose-binding lectin"
CDS 1..240 /coded_by="AF164576.1:1..723"

ORIGIN 1 mslfpslhll llivmtasht etencediqn tclviscdsp ginglpgkdg ldgakgekge
61 pgqgliglqg lpgmvgpqgs pgipglpglk gqkgdsgidp gnslanlrse ldnikkwiif
121 aqgkqvgkkl yitngkkmsf ngvkalcaqf qasvatptns renqaiqela gteaflgitd
181 eytegqfvdl tgkrvryqnw ndgepnnads aehcveilkd gkwndifcss qlsavcefpa

SEQ ID NO: 18 

NP_034905 mannose binding lectin, liver (A) [Mus musculus]
gi|6754654|ref|NP_034905.1|[6754654]

source 1..239 /organism="Mus musculus" /db_xref="taxon:10090"
/chromosome="14" /map="14 15.0 cM"
Protein 1..239 /product="mannose binding lectin, liver (A)"

misc_feature 19..239 /partial /note="mature protein based on homology to rat MPB-A"

Region 126..236 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 135..237 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"

CDS 1..239 /gene="Mbl1" /coded_by="NM_010775.1:121..840"
/db_xref="LocusID:17194" /db_xref="MGD:96923"

ORIGIN 1 mlllpllpvl lcvvsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg
61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeirilkski qltnklhafs
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaiprnae enkaiqevat giaflgitde
181 ategqfmyvt ggrltysnwk kdepnnhgsg edcviildng lwndiscqas fkavcefpa

SEQ ID NO: 19 

NP_034906 mannose binding lectin, serum (C) [Mus musculus]
gi|6754656|ref|NP_034906.1|[6754656]

source 1..244 /organism="Mus musculus" /strain="BALB/c"
/sub_species="domesticus" /db_xref="taxon:10090" /chromosome="19" /map="19
25.0 cM" /clone="a10" /tissue_type="liver" /clone_lib="lambda gt10"
Protein 1..244 /product="mannose binding lectin, serum (C)"
sig_peptide 1..18

Region 120..241 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 140..242 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"

CDS 1..244 /gene="Mbl2" /coded_by="NM_010776.1:177..911"
/note="polysaccharide-binding component of RaRF; sequence similarity to
mannose-binding proteins" /db_xref="LocusID:17195" /db_xref="MGD:96924"

ORIGIN 1 msiftsflll cvvtvvyaet ltegvqnscp wtcsspgln gfpgkdgrdg akgekgepgq
61 girglqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseidseiaa lrselrairn
121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl
181 gitdvrvegs fedltgnrvr ytnwndgepn ntgdgedcw ilgngkwndv pcsdsflaic
241 efsd

SEQ ID NO: 20

AAL14428 dendritic cell-specific ICAM-3 grabbing nonintegrin [Macaca nemestrina]
gi|16118455|gb|AAL14428.1|AF343727_1[16118455]

source 1..381 /organism="Macaca nemestrina" /db_xref="taxon:9545"
/cell_type="peripheral blood-derived dendritic cells"
Protein 1..381 /product="dendritic cell-specific ICAM-3 grabbing nonintegrin"
/name="membrane-associated mannose binding lectin"
CDS 1..381 /coded_by="AF343727.1:1..1146" /note="DC-SIGN"

ORIGIN 1 msdskeprlq qldlleeeql ggvgfrqtrg ykslagcigh gplvlqllsf tllagllvqv
61 skvpsslsqg qskqdaiyqn ltqikvavse lsekskqqei yqeltrikaa vgelpekskq
121 qeiyeeltri raavgelpek sklqeiyqel trikaavgel pekskqqeiy qelsrikaav
181 gdlpekskqq eiyqkltqik aavdglpdrs kqqeiyqeli qlkaaveric hpcpwe\A^
241 qgncyfmsns qrnwhdsita cqevgaqlw lksaeeqnfl qlqssrsnrf twmglsdinh
301 egtwq\AA/dgs pllpsfkqyw nkgepnnvge edcaefsgng wnddkcniak fwickksaas
361 csgdeerils papttpnppp a

SEQ ID NO: 21 

AAF63470 mannose binding-like lectin precursor [Carassius auratus] 
gi|7542474|gb|AAF63470.1|AF227739_1[7542474]

source 1..246 /organism="Carassius auratus" /db_xref="taxon:7957"
/tissue_type="liver"

Protein <1..246 /product="mannose binding-like lectin precursor" /name="collectin"
sig_peptide <1..13
Region 14..25 /region_name="N-terminal segment"
Region 26..93 /region_name="collagen-like structure"
Region 60..63 /region_name="break in collagen structure"
Region 94..124 /region_name="neck region"
Region 125..246 /region_name="carbohydrate recognition domain" /note="CRD"
CDS 1..246 /gene="MBL" /coded_by="AF227739.1:<1..742" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose"

ORIGIN 1 llllqfalql ldgaepqnin cpayggvpgt pghnglpgrd grdgkdgaig pkgekgesgv
61 svqgppgkag ppgtagel^ge rgpsgpqgsp gsesvleslk seiqqikaki atfekvssvc
121 hfrkvgqkyy itdgwgnfd qglkscmefg gtmvsprtsa enqallklvv ssglgskkpy
181 igvtdrkteg qfvdtegkql tftnwgpgqp ddykglqdcg viedtglwdd ggcgdirpim
241 ceidik

SEQ ID NO: 22

AAF63469 mannose binding-like lectin precursor [Danio rerio] 
gi|7542472|gb|AAF63469.1|AF227738_1[7542472]

sig_peptide 1..23

mat_peptide 24..251 /product="mannose binding-like lectin"
Region 24..36 /region_name="N-terminal segment"

Region 37..101 /region_name="collagen-like structure"
Region 71..74 /region_name="break in collagen structure"
Region 102..132 /region_name="neck region"
Region 133..251 /region_name="carbohydrate recognition domain" /note="CRD"
CDS 1..251 /gene="mbl" /coded_by="AF227738.1:68..823" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate 
specificity for galactose" 

ORIGIN 1 mallklflga llllqlvlql magaadpqsl ncpayagvpg tpghngtpgr dgrvgrdgan 

61 gpkgekgepg vnvqgppgka gppgpagakg ergpsglpgq dcmsdslkse lqklsdkial
121 iekwnfktf kkvgqkyyvt ddveetfdkg mqycssngga lvlprtleen alikvfvssa
181 fkrifiritd rekegefvdt drkkltftnw gpnqpdnykg aqdcgaiads glwddvscds 
241 lypiiceiei k 

SEQ ID NO: 23 

AAF63468 mannose binding-like lectin precursor [Cyprinus carpio] 
gi|7542470|gb|AAF63468.1|AF227737_1[7542470]

sig_peptide 1..23 

mat_peptide 24..256 /product="mannose binding-like lectin" 
Region 24..35 /region_name="N-terminal segment" 
Region 36..103 /region_name="collagen-like structure"
Region 70..73 /region_name="break in collagen structure"
Region 104..134 /region_name="neck region"

Region 135..256 /region_name="carbohydrate recognition domain" /note="CRD" 
CDS 1..256 /gene="MBL" /coded_by="AF227737.1:67..837" /note="collectin with 
structural homology to mannose-binding lectin but with a predicted carbohydrate 
specificity for galactose" 

ORIGIN 1 malfklflgt llllqfalql ldgaepqnin cpayggvpgt pghnglpgrd grdgkdgaig
61 pkgekgesgv svqgppgkag ppgpagekge rgptgsqgsp gsesvleslk seiqqlkaki
121 atfekvasvg hfrqvgqkyy itdgyvgtfd qglkfckdfg gtmvfprtsa enqallklw 
181 ssglsskkpy igvtdreteg rfvntegkql tftnwgpgqp ddykglqdcg viedsglwdd 
241 gscgdirpim ceidnk 

SEQ ID NO: 24 

AAF21018 mannose-binding lectin 2 [Sus scrofa]
gi|6644342|gb|AAF21018.1|AF208528_1[6644342]

source 1..31 /organism="Sus scrofa" /db_xref="taxon:9823" /chromosome="14"
/map="between S0007 and Sw210"
Protein <1..>31 /product="mannose-binding lectin 2" /name="MBL2"
CDS 1..31 /gene="MBL2"
/coded_by="join(AF208528.1:<1..25,AF208528.1:703..>771)"
ORIGIN 1 tkgekgepgp gfrgsqgppg kmgppgnige t

SEQ ID NO: 25

AAK30298 mannose-binding lectin precursor protein [Gallus gallus] 
gi|13561409|gb|AAK30298.1|[13561409] 

sig_peptide 1..21 

mat_peptide 22..254 /product="mannose-binding lectin protein"
Region 22..46 /region_name="N-terminal segment"
Region 47..102 /region_name="collagen-like"
Region 66 /region_name="break in collagen-like structure"
Region 103..139 /region_name="neck region"

Region 140..254 /region_name="carbohydrate recognition domain; CRD"
CDS 1..254 /coded_by="AF231714.1:242..1006"
ORIGIN 1 mtllqpfsal llclslmmat sllttdkpee kmyscpiiqc sapavnglpg rdgrdgpkge
61 kgdpgeglrg lqglpgkagp qglkgevgpq gekgqkgerg iwtddlhrq itdleakirv
121 leddlsrykk alslkdwnv gkkmfvstgk kynfekgksl cakagsvlas prneaental
181 kdlidpssqa yigisdaqte grfmylsggp ltysnwkpge pnnhknedca viedsgkwnd
241 ldcsnsnifi icel

SEQ ID NO: 26 

LNMSMC mannose-binding lectin C precursor - mouse 
gi|7428747|pir||LNMSMC[7428747] 



FEATURES Location/Qualifiers source 1..244 /organism="Mus musculus"
/db_xref="taxon:10090"

Protein 1..244 /product="mannose-binding lectin C precursor" /note="Ra-reactive
factor P28a"

Region 1..18 /region_name="domain" /note="signal sequence"
Region 19..244 /region_name="product" /note="mannose-binding lectin C"
Bond bond(29) /bond_type="disulfide" /note="interchain"
Bond bond(34) /bond_type="disulfide" /note="interchain"
Region 38..94 /region_name="region" /note="collagen-like"
Site 69 /site_type="modified" /note="4-hydroxyproline (Pro)"
Region 124..240 /region_name="domain" /note="C-type lectin homology #label
LCH"

ORIGIN 1 msiftsflll cvvtvvyaet ltegvqnscp wtcsspgin gfpgkdgrdg akgekgepgq
61 girgiqgppg kvgptgppgn pglkgavgpk gdrgdraefd tseidseiaa lrselrairn
121 wvlfslsekv gkkyfvssvk kmsldrvkal csefqgsvat prnaeensai qkvakdiayl

181 gitdvrvegs fedltgnrvr ytnwndgepn ntgdgedcw ilgngkwndv pcsdsflaic 
241 efsd 

SEQ ID NO: 27

LNMSMA mannose-binding lectin A precursor - mouse
gi|625320|pir||LNMSMA[625320]

FEATURES Location/Qualifiers source 1..239 /organism="Mus musculus"
/db_xref="taxon:10090"

Protein 1..239 /product="mannose-binding lectin A precursor" /note="Ra-reactive
factor P28b; serum mannan-binding protein"

Region 1..17 /region_name="domain" /note="signal sequence"
Region 18..238 /region_name="product" /note="mannose-binding lectin A"
Region 36..88 /region_name="region" /note="collagen-like"
Region 119..235 /region_name="domain" /note="C-type lectin homology #label
LCH"

ORIGIN 1 mlllpllpvl lcvvsvsssg sqtcedtlkt csviacgrdg rdgpkgekge pgqglrglqg
61 ppgklgppgs vgspgspgpk gqkgdhgdnr aieeklanme aeiriikski qltnklhafs
121 mgkksgkklf vtnhekmpfs kvkslctelq gtvaiprnae enkaiqevat giaflgitde
181 ategqfmyvt ggrltysnwk kdepnnhgsg edcviildng lwndiscqas fkavcefpa

SEQ ID NO: 28

LNRTMA mannose-binding lectin A precursor - rat
gi|71975|pir||LNRTMA[71975]

FEATURES Location/Qualifiers source 1..238 /organism="Rattus norvegicus"
/db_xref="taxon:10116"
Protein 1..238 /product="mannose-binding lectin A precursor" /note="serum
mannan-binding protein"

Region 1..17 /region_name="domain" /note="signal sequence"
Region 18..238 /region_name="product" /note="mannose-binding lectin A"
Region 36..88 /region_name="region" /note="collagen-like"
Site 61 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 67 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 73 /site_type="modified" /note="4-hydroxyproline (Pro)"
Site 79 /site_type="modified" /note="lysine derivative (Lys) (probably 5-
hydroxylysine)"
Site 82 /site_type="modified" /note="lysine derivative (Lys) (probably 5-
hydroxylysine)"
Region 85..87 /region_name="region" /note="cell attachment (R-G-D) motif"
Region 118..234 /region_name="domain" /note="C-type lectin homology #label
LCH"

ORIGIN 1 mlllpllvll cvvsvsssgs qtceetlktc sviacgrdgr dgpkgekgep gqglrglqgp
61 pgklgppgsv gapgsqgpkg qkgdrgdsra ievklanmea eintlkskle ltnklhafsm
121 gkksgkkffv tnhermpfsk vkalcselrg tvaiprnaee nkaiqevakt saflgitdev
181 tegqfmyvtg grltysnwkk depndhgsge dcvtivdngi wndiscqash tavcefpa

SEQ ID NO: 29 

LNRTMC mannose-binding lectin C precursor - rat
gi|71974|pir||LNRTMC[71974]
FEATURES Location/Qualifiers source 1..244 /organism="Rattus norvegicus"
/db_xref="taxon:10116"

Protein 1..244 /product="mannose-binding lectin C precursor"
Region 1..18 /region_name="domain" /note="signal sequence"
Region 19..244 /region_name="product" /note="mannose-binding lectin C"
Bond bond(29) /bond_type="disulfide" /note="interchain"
Bond bond(34) /bond_type="disulfide" /note="interchain"
Region 38..94 /region_name="region" /note="collagen-like"
Site 69 /site_type="modified" /note="4-hydroxyproline (Pro)"
Region 124..240 /region_name="domain" /note="C-type lectin homology #label
LCH"

ORIGIN 1 mslftsflll cvltavyaet ltegaqsscp viacsspgin gfpgkdghdg akgekgepgq
61 girglqgppg kvgpagppgn pgskgatgpk gdrgesvefd ttnidleiaa lrselramrk
121 wvllsmseriv gkkyfmssvr rmplnrakal cselqgtvat prnaeenrai qnvakdvafi 
181 gitdqrtenv fedltgnrvr ytnwnegepn nvgsgencw lltngkwndv pcsdsflwc 
241 efsd 


SEQ ID NO: 30 

LNHUMC mannose-binding lectin precursor - human
gi|71973|pir||LNHUMC[71973]
FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606"

Protein 1..248 /product="mannose-binding lectin precursor" /note="mannan-binding
protein"

Region 1..20 /region_name="domain" /note="signal sequence"
Region 21..248 /region_name="product" /note="mannose-binding lectin"
Region 42..99 /region_name="region" /note="collagen-like"
Site 47 /site_type="modified" /note="4-hydroxyproline (Pro) (partial)"
Site 73 /site_type="modified" /note="4-hydroxyproline (Pro) (partial)"
Site 79 /site_type="modified" /note="4-hydroxyproline (Pro) (partial)"
Site 82 /site_type="modified" /note="4-hydroxyproline (Pro) (partial)"
Site 88 /site_type="modified" /note="4-hydroxyproline (Pro) (partial)"
Region 128..244 /region_name="domain" /note="C-type lectin homology #label
LCH"

ORIGIN 1 mslfpslpil llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqnlike 
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsn 
241 lavcefpi 

SEQ ID NO: 31

BAA86864 complement C1s [Homo sapiens]
gi|6407558|dbj|BAA86864.1|[6407558]

FEATURES Location/Qualifiers source 1..329 /organism="Homo sapiens"
/db_xref="taxon:9606" /tissue_type="peripheral leukocytes" /clone_lib="FIXII"
Protein 1..329 /product="complement C1s"

CDS 1..329 /coded_by="join(AB009076.1:1142..1146,
AB009076.1:1703..1910,AB009076.1:2118..2295,
AB009076.1:3495..3620,AB009076.1:4328..4527,
AB009076.1:5047..5200,AB009076.1:5748..>5863)" /note="[illegible]
total 12 exons, the last 4 exons of which were reported by Toshi, M. et al. (J. Mol. Biol.

ORIGIN VmwdvSlI awvyaeptmy geilspnypq aypseveksw ^ievpegygi hlyfthldie
61 lsencaydsv qiisgdteeg ricgqrssnn phspiveefq vpynklqvif ksdfsneerf
121 tqfaayyvat dinectdfvd vpcshfcnnf iggyfcscpp eyflhddmkn cgvncsgdvf
181 taligeiasp nypkpypens rceyqirlek gfqwvtirr edfdveaads agncldslvf
241 vagdrqfgpy cghgfpgpln ietksnaidi ifqtdltgqk kgwklryhgd pmpcpkedtp
301 nsvwepakak yvfrdwqit cidgfewe 

SEQ ID NO: 32

CAB56124 mannose-binding lectin [Homo sapiens]
gi|5911809|emb|CAB56124.1|[5911809]

FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype HYPD"
Protein 1..248 /product="mannose-binding lectin"

CDS 1..248 /gene="MBL" /coded_by="Y16582.1:892..1638"
ORIGIN 1 mslfpslpil llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd gcdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqnlike
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsn
241 lavcefpi

SEQ ID NO: 33

CAB56123 mannose-binding lectin [Homo sapiens] 
gi|5911807|emb|CAB56123.1|[5911807]

FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype HYPA"

Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20
CDS 1..248 /gene="MBL" /coded_by="Y16581.1:892..1638"

ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh

241 lavcefpi 

SEQ ID NO: 34 

CAB56122 mannose-binding lectin [Homo sapiens] 
gi|5911798|emb|CAB56122.1|[5911798]

FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype LXPA"
Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20

CDS 1..248 /gene="MBL" /coded_by="Y16580.1:892..1638"

ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqniike

181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh 
241 lavcefpi 

SEQ ID NO: 35 
CAB56121 mannose-binding lectin [Homo sapiens]
gi|5911796|emb|CAB56121.1|[5911796]

FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype LYPB"

Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20

CDS 1..248 /gene="MBL" /coded_by="Y16579.1:892..1638"
ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grddtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema

121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqniike 
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh 
241 lavcefpi 

SEQ ID NO: 36

CAB56045 mannose-binding lectin [Homo sapiens]
gi|5911794|emb|CAB56045.1|[5911794]

/organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="10"
/map="10q11.2-q21" /note="MBL haplotype LYQC"
Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20
CDS 1..248 /gene="MBL" /coded_by="Y16578.1:886..1632"
ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkeekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdsslaas erkalqtema
121 rikkwltfsl gkqvgnkffi tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvditg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh
241 lavcefpi

SEQ ID NO: 37

CAB56120 mannose-binding lectin [Homo sapiens] 
gi|5911792|emb|CAB56120.1|[5911792]
FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype LYPA"

Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20
CDS 1..248 /gene="MBL" /coded_by="Y16577.1:892..1638"
ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdssiaas erkalqtema
121 rikkwitfsl gkqvgnkffl tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvdltg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh
241 lavcefpl

SEQ ID NO: 38

CAB56044 mannose-binding lectin [Homo sapiens] 
gi|5911790|emb|CAB56044.1|[5911790]
FEATURES Location/Qualifiers source 1..248 /organism="Homo sapiens"
/db_xref="taxon:9606" /chromosome="10" /map="10q11.2-q21" /note="MBL
haplotype LYQA"

Protein 1..248 /product="mannose-binding lectin"
sig_peptide 1..20
CDS 1..248 /gene="MBL" /coded_by="Y16576.1:886..1632"

ORIGIN 1 mslfpslpll llsmvaasys etvtcedaqk tcpaviacss pgingfpgkd grdgtkgekg
61 epgqglrglq gppgklgppg npgpsgspgp kgqkgdpgks pdgdssiaas erkalqtema
121 rikkwitfsl gkqvgnkffl tngeimtfek vkalcvkfqa svatprnaae ngaiqniike
181 eaflgitdek tegqfvdltg nrltytnwne gepnnagsde dcvlllkngq wndvpcstsh
241 lavcefpi

SEQ ID NO: 39

AAB53110 C1qR(p) [Homo sapiens]
gi|2052498|gb|AAB53110.1|[2052498]

FEATURES Location/Qualifiers source 1..652 /organism="Homo sapiens"
/db_xref="taxon:9606" /cell_line="U937 histiocytic cell line"
Protein 1..652 /product="C1qR(p)" /function="mediates enhanced phagocytosis by
human monocytes and macrophages in response to complement C1q, mannose
binding lectin (MBL) and pulmonary surfactant protein A (SPA)"
CDS 1..652 /coded_by="U94333.1:149..2107" /note="C1q/MBL/SPA receptor"
ORIGIN 1 matsmgllll llllltqpga gtgadteavv cvgtacytah sgklsaaeaq nhcnqnggnl
61 atvkskeeaq hvqrvlaqll rreaaltarm skfwiglqre kgkcldpslp lkgfswvggg
121 edtpysnwhk eirnsciskr cvsllldlsq pllpnrlpkw segpcgspgs pgsniegfvc
181 kfsfkgmcrp lalggpgqvt yttpfqttss sleavpfasa arivacgegdk detqshyfic
241 kekapdvfdw gssgplcvsp kygcnfnngg chqdcfeggd gsflcgcrpg frilddlvtc
301 asmpcsssp crggatcvlg phgknytcrc pqgyqidssq ldcvdvdecq dspcaqecvn
361 tpggfrcecw vgyepggpge gacqdvdeca lgrspcaqgc tntdgsfhcs ceegyvlage
421 dgtqcqdvde cvgpggplcd sicfntqgsf hcgclpgwvl apngvsctmg pvslgppsgp
481 pdeedkgeke gstvpraata sptrgpegtp katpttsrps lssdapitsa piknalapsgs
541 sgvwrepsih hataasgpqe paggdssvat qnndgtdgqk lllfyilgtv vaillllala
601 lglivyrkn-akreekkekk pqnaadsysw vperaesram enqysptpgt dc

SEQ ID NO: 40

NP_571645 mannose binding-like lectin [Danio rerio]
gi|18858997|ref|NP_571645.1|[18858997]

sig_peptide 1..23
mat_peptide 24..251 /product="mannose binding-like lectin"
Region 24..36 /region_name="N-terminal segment"
Region 33..70 /region_name="Collagen triple helix repeat (20 copies)"
/note="Collagen" /db_xref="CDD:pfam01391"
Region 33..70 /region_name="Collagen triple helix repeat (20 copies)"
/note="Collagen" /db_xref="CDD:pfam01391"
Region 37..101 /region_name="collagen-like structure"
Region 37..70 /region_name="Collagen triple helix repeat (20 copies)"
/note="Collagen" /db_xref="CDD:pfam01391"
Region 71..74 /region_name="break in collagen structure"
Region 102..132 /region_name="neck region"
Region 133..251 /region_name="carbohydrate recognition domain" /note="CRD"
Region 134..247 /region_name="C-type lectin (CTL) or carbohydrate-recognition
domain (CRD)" /note="CLECT" /db_xref="CDD:smart00034"
Region 146..247 /region_name="Lectin C-type domain" /note="lectin_c"
/db_xref="CDD:pfam00059"
CDS 1..251 /gene="mbl" /coded_by="NM_131570.1:68..823" /note="collectin with
structural homology to mannose-binding lectin but with a predicted carbohydrate
specificity for galactose; mannose binding-like lectin" /db_xref="LocusID:58091"
ORIGIN 1 mallklflga llllqlvlql magaadpqsl ncpayagvpg tpghnglpgr dgrvgrdgan
61 gpkgekgepg vnvqgppgka gppgpagakg ergpsglpgq dcmsdslkse lqklsdkial
121 iekwnfktf kkvgqkyyvt ddveetfdkg mqycssngga lvlprtleen allkvfvssa
181 fkrifiritd rekegefvdt drkkltftnw gpnqpdhykg aqdcgaiads glwddvscds
241 lypiiceiei k

SEQ ID NO: 41

BAA90338 mannose-binding lectin-associated serine protease (MASP) related 
protein [Cyprinus carpio] gi|6807499|dbj|BAA90338.1|[6807499] 
FEATURES Location/Qualifiers source 1..118 /organism="Cyprinus carpio"
/db_xref="taxon:7962"
Protein 1..118 /product="mannose-binding lectin-associated serine protease
(MASP) related protein"
CDS 1..118 /gene="MRPb"
/coded_by="join(AB030447.1:<1..96,AB030447.1:201..319,
AB030447.1:436..514,AB030447.1:616..680)" /note="MASP-related protein"
ORIGIN 1 kiqtgsntvs ilfhsdnsgd nigwkltyts tgsecsplaa pinghleplq snyifkdhim
61 ltcdpgysir qgdkefehyq iecqrdgkws sdvplckkka sqrrhrslps iltnqils



The second polypeptide preferably comprises at least 10, such as at least 12, for
example at least 15, such as at least 20, for example at least 25, such as at least
30, for example at least 35, such as at least 40, for example at least 50 consecutive
amino acid residues of the collectin or of a variant or a homologue of said protein.
Such a variant or homologue is preferably at least 70%, such as 80%, for example
90%, such as 95% identical to the collectin.
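By way of illustration only, the identity thresholds above can be checked as sketched below. The sketch assumes that the two sequences have already been aligned to equal length with "-" as the gap symbol, and that identity is counted over all aligned columns; this passage does not prescribe an alignment method or a denominator, so both are assumptions. The function names and the two short example peptides are hypothetical.

```python
GAP = "-"

def percent_identity(aligned_ref: str, aligned_cand: str) -> float:
    """Percent identity over aligned columns (assumed convention).

    Columns in which both sequences carry a gap are ignored; a column
    in which only one sequence carries a gap counts as a mismatch.
    """
    if len(aligned_ref) != len(aligned_cand):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, c)
        for r, c in zip(aligned_ref.upper(), aligned_cand.upper())
        if not (r == GAP and c == GAP)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for r, c in columns if r == c)
    return 100.0 * matches / len(columns)

def highest_threshold_met(identity: float, thresholds=(70.0, 80.0, 90.0, 95.0)):
    """Return the highest listed threshold reached, or None if none is reached."""
    met = [t for t in thresholds if identity >= t]
    return max(met) if met else None

# Hypothetical ten-residue example: one substitution (E -> D) in ten columns.
reference = "GEKGEPGQGL"
candidate = "GEKGDPGQGL"
identity = percent_identity(reference, candidate)
print(identity, highest_threshold_met(identity))
```

In practice the alignment itself would be produced by an established alignment program, and the choice of denominator (alignment length or length of the shorter sequence) should be stated alongside any reported identity value.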
In a preferred embodiment the second polypeptide sequence comprises the CRD
domain of MBL or the neck region of MBL or the collagen-like domain of MBL. More
preferably the second polypeptide comprises the neck region and the CRD domain
of MBL. In a most preferred embodiment the second polypeptide sequence comprises
the collagen-like domain, the neck region and the CRD domain of MBL. MBL
is as defined above.

Preferably the second polypeptide sequence comprises at least amino acids 170-200
of the MBL sequence shown in Figure 2, such as at least amino acids 160-200,
such as at least amino acids 150-200, such as at least amino acids 140-200,
such as at least amino acids 130-200, such as at least amino acids 120-200,
such as at least amino acids 110-200, such as at least amino acids 100-200,
such as at least amino acids 90-200, such as at least amino acids 80-200,
such as at least amino acids 70-200, such as at least amino acids 60-200,
such as at least amino acids 80-228 of the MBL sequence shown in Figure 2.

Preferably the second polypeptide sequence comprises amino acids 80-228 of
SEQ ID NO: 2.

In a preferred embodiment the second polypeptide sequence is capable of associating
with at least one MASP protein, such as a MASP protein selected from the
group consisting of MASP-1, MASP-2 and MASP-3 or functional homologues or
variants thereof. In particular the second polypeptide is capable of associating with
said at least one MASP protein when being part of the fusion protein. Thereby the
second polypeptide sequence is capable of providing the fusion protein with
complement system activating activity. In a preferred embodiment the second
polypeptide sequence comprises an amino acid sequence selected from: 56-228 of
SEQ ID NO: 2, 55-228 of SEQ ID NO: 2, 54-228 of SEQ ID NO: 2, and 50-228 of
SEQ ID NO: 2. In a preferred embodiment the second polypeptide sequence has an
amino acid sequence selected from: 56-228 of SEQ ID NO: 2, 55-228 of SEQ ID NO: 2,
54-228 of SEQ ID NO: 2, and 50-228 of SEQ ID NO: 2.

In another embodiment the second polypeptide comprises the cysteine-rich region
of the collectin, such as the N-terminal region of the collectin.

Fusion protein 

The fusion protein comprises the first and the second polypeptide connected to each
other, optionally through a linker region. In a preferred embodiment the first
polypeptide sequence is positioned N-terminally in the fusion protein and the second
polypeptide sequence is positioned C-terminally.

Specific examples of the components of the fusion protein are:

- A fusion protein comprising the cysteine-rich region and the collagen-like domain
of L-ficolin and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region of L-ficolin and the
collagen-like domain, the neck region and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region and the collagen-like domain
of H-ficolin and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region of H-ficolin and the
collagen-like domain, the neck region and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region and the collagen-like domain
of M-ficolin and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region of M-ficolin and the
collagen-like domain, the neck region and the CRD domain of MBL.

- A fusion protein comprising the cysteine-rich region of MBL and the CRD domain
of ficolin.

- A fusion protein comprising the cysteine-rich region of MBL and the collagen-like
domain, the neck region and the CRD domain of ficolin.

- A fusion protein comprising the cysteine-rich region and the collagen-like domain
of L-ficolin and the CRD domain of Pulmonary surfactant-associated protein D.

- A fusion protein comprising the cysteine-rich region of L-ficolin and the
collagen-like domain, the neck region and the CRD domain of Pulmonary
surfactant-associated protein D.

- A fusion protein comprising the cysteine-rich region and the collagen-like domain
of a ficolin and the CRD domain of a collectin-43.

- A fusion protein comprising the cysteine-rich region of a ficolin and the
collagen-like domain, the neck region and the CRD domain of a collectin-43.

- A fusion protein comprising the amino acid sequence as defined by the sequence
shown in Figure 3, or a functional homologue thereof, preferably a fusion protein
consisting of the amino acid sequence as shown in Figure 3. In another embodiment
the fusion protein has amino acid sequence 1-50 of the amino acid sequence shown in
Figure 1 and amino acid sequence 54-228 of the amino acid sequence shown in Figure 2.
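The assembly of such a fusion sequence from residue ranges can be sketched as follows. The sketch assumes that residue ranges are 1-based and inclusive, as in the sequence listing, and that the first polypeptide is placed N-terminally with an optional linker between the two parts. The function names are hypothetical, and the two placeholder strings merely stand in for the sequences of Figure 1 and Figure 2, which are not reproduced here.

```python
def residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(
            f"range {start}-{end} lies outside a sequence of length {len(sequence)}"
        )
    return sequence[start - 1:end]

def build_fusion(first: str, first_range: tuple,
                 second: str, second_range: tuple,
                 linker: str = "") -> str:
    """Join the first polypeptide part (N-terminal), an optional linker,
    and the second polypeptide part (C-terminal)."""
    return residues(first, *first_range) + linker + residues(second, *second_range)

# Placeholder sequences only (assumption): "f" and "m" are not real residues
# from Figure 1 or Figure 2, they just make the two parts distinguishable.
figure1_placeholder = "f" * 100
figure2_placeholder = "m" * 228

fusion = build_fusion(figure1_placeholder, (1, 50),
                      figure2_placeholder, (54, 228))
print(len(fusion))  # 50 residues from the first part + 175 from the second
```

With real sequences substituted for the placeholders, the same call yields the 1-50 / 54-228 combination described above; other ranges from the preceding paragraphs can be passed in the same way.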


As discussed above the fusion protein is preferably capable of forming subunit
complexes as well as oligomers of subunit complexes. Preferably the fusion protein
forms substantially only trimeric, tetrameric, pentameric and hexameric subunit
oligomers, such as trimeric, tetrameric and pentameric subunit oligomers, such as
trimeric or tetrameric subunit oligomers, more preferably substantially only
tetrameric subunit oligomers, in order to obtain a more homogeneous composition of
fusion proteins.

Homologues

• 5 In the present context the terms homologue or variant or functional homologues are 
used as synonymes, wherein a homologue of a protein exhibits one or more substi- 
tuions, deletions, and/or additions of one or more amino acid residues. Fragments 
are a subgroup of honiologues being truncations of the protein. 

A homologue of the protein may comprise one or more conservative amino acid
substitutions, such as at least 2 conservative amino acid substitutions, for example
at least 3 conservative amino acid substitutions, such as at least 5 conservative
amino acid substitutions, for example at least 10 conservative amino acid
substitutions, such as at least 20 conservative amino acid substitutions, for
example at least 50 conservative amino acid substitutions, such as at least 75
conservative amino acid substitutions, for example at least 100 conservative amino
acid substitutions. A conservative amino acid substitution within the meaning of
the present invention is the substitution of one amino acid within a predetermined
group of amino acids for another amino acid within the same predetermined group,
exhibiting similar or substantially similar characteristics. Such predetermined
groups are for example:

polar side chains (Asp, Glu, Lys, Arg, His, Asn, Gln, Ser, Thr, Tyr and Cys),

non-polar side chains (Gly, Ala, Val, Leu, Ile, Phe, Trp, Pro and Met),

aliphatic side chains (Gly, Ala, Val, Leu, Ile),

cyclic side chains (Phe, Tyr, Trp, His, Pro),

aromatic side chains (Phe, Tyr, Trp),

acidic side chains (Asp, Glu),

basic side chains (Lys, Arg, His),

hydroxy side chains (Ser, Thr),

sulphur-containing side chains (Cys, Met), and

amino acids being monoamino-dicarboxylic acids or monoamino-monocarboxylic-
monoamidocarboxylic acids (Asp, Glu, Asn, Gln).
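
The group-based definition above can be sketched as a simple membership test.
The Python below is only an illustration: the group names, the one-letter
encoding and the function names are assumptions made for this sketch, while the
group memberships are copied from the list above.

```python
# Predetermined groups from the list above, written with one-letter codes.
GROUPS = {
    "polar": set("DEKRHNQSTYC"),
    "non-polar": set("GAVLIFWPM"),
    "aliphatic": set("GAVLI"),
    "cyclic": set("FYWHP"),
    "aromatic": set("FYW"),
    "acidic": set("DE"),
    "basic": set("KRH"),
    "hydroxy": set("ST"),
    "sulphur-containing": set("CM"),
    "acid/amide": set("DENQ"),
}


def shared_groups(original: str, substitute: str) -> list:
    """Return the names of all groups that contain both residues, sorted."""
    return sorted(
        name for name, members in GROUPS.items()
        if original in members and substitute in members
    )


def is_conservative(original: str, substitute: str) -> bool:
    """A substitution is conservative here if both residues share any group."""
    return bool(shared_groups(original, substitute))


print(shared_groups("S", "T"))    # groups shared by Ser and Thr
print(is_conservative("D", "L"))  # Asp and Leu share no listed group
```

Because the polar and non-polar groups are broad, a stricter screen could
require a shared group other than those two; that choice is left open by the
text.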

Conservative substitutions may be introduced in any position of a preferred protein.

It may however also be desirable to introduce non-conservative substitutions. A
non-conservative substitution should lead to the formation of a homologue of a
protein capable of exerting a function similar to the function of said protein.
Such a substitution could for example i) differ substantially in hydrophobicity,
for example a hydrophobic residue (Val, Ile, Leu, Phe or Met) substituted for a
hydrophilic residue such as Arg, Lys, Trp or Asn, or a hydrophilic residue such as
Thr, Ser, His, Gln, Asn, Lys, Asp, Glu or Trp substituted for a hydrophobic
residue; and/or ii) differ substantially in its effect on polypeptide backbone
orientation, such as substitution of or for Pro or Gly by another residue; and/or
iii) differ substantially in electric charge, for example substitution of a
negatively charged residue such as Glu or Asp for a positively charged residue such
as Lys, His or Arg (and vice versa); and/or iv) differ substantially in steric
bulk, for example substitution of a bulky residue such as His, Trp, Phe or Tyr for
one having a minor side chain, e.g. Ala, Gly or Ser (and vice versa).

In a further embodiment the present invention relates to homologues of a preferred
protein, wherein such homologues comprise substituted amino acids having
hydrophilic or hydropathic indices that are within +/-2.5, for example within
+/-2.3, such as within +/-2.1, for example within +/-2.0, such as within +/-1.8,
for example within +/-1.6, such as within +/-1.5, for example within +/-1.4, such
as within +/-1.3, for example within +/-1.2, such as within +/-1.1, for example
within +/-1.0, such as within +/-0.9, for example within +/-0.8, such as within
+/-0.7, for example within +/-0.6, such as within +/-0.5, for example within
+/-0.4, such as within +/-0.3, for example within +/-0.25, such as within +/-0.2
of the value of the amino acid it has substituted.

The importance of the hydrophilic and hydropathic amino acid indices in conferring
interactive biologic function on a protein is well understood in the art (Kyte &
Doolittle, 1982 and Hopp, U.S. Pat. No. 4,554,101, each incorporated herein by
reference).

The amino acid hydropathic index values as used herein are: isoleucine (+4.5);
valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5);
methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8);
tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate
(-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and
arginine (-4.5) (Kyte & Doolittle, 1982).

The amino acid hydrophilicity values are: arginine (+3.0); lysine (+3.0); aspartate
(+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine
(+0.2); glycine (0); threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5);
histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine
(-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4)
(U.S. 4,554,101).
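
As an illustration of the index-window criterion, the following Python sketch
tests whether a substituted residue lies within a chosen window of the residue
it replaces. The hydropathic values are those listed above (Kyte & Doolittle,
1982); the function name and the one-letter encoding are assumptions of this
sketch, and the window is deliberately left as a required argument because the
text names many alternative windows.

```python
# Hydropathic index values as listed in the text, keyed by one-letter code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def within_hydropathy_window(original: str, substitute: str, window: float) -> bool:
    """True if the two residues' hydropathic indices differ by at most `window`."""
    difference = abs(HYDROPATHY[original] - HYDROPATHY[substitute])
    return difference <= window


print(within_hydropathy_window("I", "V", 0.5))  # indices 4.5 and 4.2
print(within_hydropathy_window("I", "R", 2.5))  # indices 4.5 and -4.5
```

The same function works with the hydrophilicity values if the table is swapped
for the second list above.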

Substitution of amino acids can therefore in one embodiment be made based upon
their hydrophobicity and hydrophilicity values and the relative similarity of the
amino acid side-chain substituents, including charge, size, and the like.
Exemplary amino acid substitutions which take various of the foregoing
characteristics into consideration are well known to those of skill in the art and
include: arginine and lysine; glutamate and aspartate; serine and threonine;
glutamine and asparagine; and valine, leucine and isoleucine.

Furthermore, a homologue may comprise addition or deletion of an amino acid, for
example an addition or deletion of from 2 to 100 amino acids, such as from 2 to 50
amino acids, for example from 2 to 20 amino acids, such as from 2 to 10 amino
acids, for example from 2 to 5 amino acids, such as from 2 to 3 amino acids.
However, additions of more than 100 amino acids, such as additions of from 100 to
500 amino acids, are also comprised within the present invention.

Proteins sharing at least some homology with a preferred protein are to be
considered as falling within the scope of the present invention when they are at
least about 40 percent homologous, or preferably identical, with the preferred
protein, such as at least about 50 percent homologous, or preferably identical, for
example at least about 60 percent homologous, or preferably identical, such as at
least about 70 percent homologous, or preferably identical, for example at least
about 75 percent homologous, or preferably identical, such as at least about 80
percent homologous, or preferably identical, for example at least about 85 percent
homologous, or preferably identical, such as at least about 90 percent homologous,
or preferably identical, for example at least 92 percent homologous, or preferably
identical, such as at least 94 percent homologous, or preferably identical, for
example at least 95 percent homologous, or preferably identical, such as at least
96 percent homologous, or preferably identical, for example at least 97 percent
homologous, or preferably identical, such as at least 98 percent homologous, or
preferably identical, for example at least 99 percent homologous, or preferably
identical, with the preferred protein.
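
The text does not state how percent identity is to be computed. One common
convention, assumed here purely for illustration, counts identical residues
over the columns of a pairwise alignment. The Python sketch below expects two
sequences that have already been aligned (gaps written as "-"); the function
name and the gap convention are assumptions of this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


print(percent_identity("ACDEFG", "ACDQFG"))  # five of six columns identical
print(percent_identity("AC-E", "ACDE"))      # three of four columns identical
```

Other denominators (for example the length of the shorter sequence) give
different percentages, so any threshold should be read together with the
convention used.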

Preferred proteins are complement activating proteins comprising collectins and
lectins and homologues hereof.

Homologues of collectins

A homologue of a collectin, including MBL, within the scope of the present
invention should be understood as any protein capable of exerting a function
similar to the function of a collectin and comprising one or more of the variations
described above. In particular such function is the ability to activate complement
upon binding to one or more carbohydrates.

The term functional homologues of collectin used herein relates to functional
equivalents or a fragment of collectin comprising a predetermined amino acid
sequence, and such homologues are defined as:

a) A homologue comprising an amino acid sequence capable of recognising and
binding to glucans, lipophosphoglycans and glycoinositol phospholipids that
contain sugars with 3- and 4-hydroxyl groups in the pyranose ring (i.e. Man, Glc,
Fuc or GlcNAc), either alone or when being subunit complexed as described
above, and/or

b) A homologue comprising an amino acid sequence capable of forming an
association with a component of the Lectin/MBL pathway, such as binding to
MASP-1, MASP-2, MASP-3 and/or sMAP, either alone or when being subunit
complexed as described above, wherein said binding results in activation of the
Lectin/MBL pathway, and/or

c) A homologue comprising an amino acid sequence capable of forming, by the
collagen-like domain, an oligomeric structure of two or more subunits, where a
subunit comprises three identical polypeptides of a cysteine-rich region, a
collagen-like domain, a neck region and a carbohydrate recognition domain.

Homologues of lectins

A homologue of a lectin, including ficolins, within the scope of the present
invention should be understood as any protein capable of exerting a function
similar to the function of a lectin and comprising one or more of the variations
previously described. In particular such function is the ability to activate
complement upon binding to one or more carbohydrates.

The term functional homologues of lectin used herein relates to functional
equivalents of a fragment of lectin comprising a predetermined amino acid
sequence, and such homologues are defined as:

a) A homologue comprising an amino acid sequence capable of recognising and
binding to N-acetyl-glucosamine (GlcNAc), N-acetyl-galactosamine (GalNAc) or
elastin, either alone or when being subunit complexed as described above,
and/or

b) A homologue comprising an amino acid sequence capable of forming an
association with a component of the Lectin/MBL pathway, such as binding to
MASP-1, MASP-2, MASP-3 and/or sMAP, either alone or when being subunit
complexed as described above, wherein said binding results in activation of the
Lectin/MBL pathway, and/or

c) A homologue comprising an amino acid sequence capable of forming, by the
collagen-like domain, an oligomeric structure of two or more subunits, where a
subunit comprises three identical polypeptides of a cysteine-rich region, a
collagen-like domain, a neck region and a fibrinogen-like domain.

The activation of the lectin/MBL pathway, i.e. the ability of the fusion protein to
activate the complement system, may be assessed by measuring the C4 cleaving effect
of the fusion protein, or of subunit complexes or oligomers of complexes thereof,
by the following method comprising the steps of

- applying a sample comprising a predetermined amount of fusion protein as well
as a predetermined amount of MASP-1, MASP-2 or MASP-3,

- applying at least one complement factor to the sample,

- detecting the amount of cleaved complement factors,

- correlating the amount of cleaved complement factors to the amount of fusion
protein, and

- determining the activity of the fusion protein.



The complement factor preferably used in the present method is a complement
factor cleavable by the MBL/MASP-2 complex, such as C4. However, the complement
factor may also be selected from C3 and C5.

The cleaved complement factor may be detected by a variety of means, such as by
use of antibodies directed to the cleaved complement factor.
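
The correlating step of the method is not specified further in the text. One
simple way to do it, assumed here only as an illustration, is to measure the
signal for cleaved complement factor at several known amounts of fusion protein
and take the least-squares slope as a measure of activity per unit of protein.
The function name, the units and the readings below are hypothetical.

```python
def least_squares_slope(amounts, signals):
    """Slope of the ordinary least-squares line of signal against amount."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    return sxy / sxx


# Hypothetical readings: amount of fusion protein versus cleaved-C4 signal.
fusion_protein_amounts = [1.0, 2.0, 3.0, 4.0]
cleaved_c4_signal = [3.0, 5.0, 7.0, 9.0]

activity = least_squares_slope(fusion_protein_amounts, cleaved_c4_signal)
print(activity)  # signal units per unit of fusion protein
```

A linear fit is only meaningful in the range where the signal has not
saturated; a full standard curve would be needed outside that range.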



The assay is carried out at conditions which minimize or eliminate interference from
the classical complement activation pathway and the alternative complement
activation pathway.

Preferably a homologue of a collectin and/or a lectin exhibits two of the functions
defined above, more preferably three of the functions defined above.

Preparation of fusion protein 

The fusion protein may be prepared by any suitable method known to the person 
skilled in the art. Below are described several of the methods for preparing the fu- 
sion protein, however the invention is not limited to those methods. 

Synthetic preparation 

When appropriate, in particular in relation to the size of the fusion protein, the
fusion protein may be produced synthetically. The methods for synthetic production
of peptides are well known in the art. Detailed descriptions as well as practical
advice for producing synthetic peptides may be found in Synthetic Peptides: A
User's Guide (Advances in Molecular Biology), Grant G. A. ed., Oxford University
Press, 2002, or in: Pharmaceutical Formulation: Development of Peptides and
Proteins, Frokjaer and Hovgaard eds., Taylor and Francis, 1999.

Recombinant preparation 



The fusion proteins of the invention are preferably produced by use of recombinant
DNA technologies. The DNA sequence encoding each part of the fusion protein may
be prepared by fragmentation of the DNA sequence (genomic DNA or cDNA) encoding
the full-length protein from which the fusion protein part is derived, using
DNase I according to a standard protocol (Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., CSHL Press, Cold Spring Harbor, NY, 1989). The
obtained DNA sequences encoding the individual parts of the fusion protein may
then be fused together.

The DNA sequence may also be prepared by polymerase chain reaction using specific
primers, for instance as described in US 4,683,202 or Saiki et al., 1988,
Science 239:487-491.

The DNA sequence encoding a fusion protein of the invention may be prepared
synthetically by established standard methods, e.g. the phosphoramidite method
described by Beaucage and Caruthers, 1981, Tetrahedron Lett. 22:1859-1869, or
the method described by Matthes et al., 1984, EMBO J. 3:801-805. According to the
phosphoramidite method, oligonucleotides are synthesized, e.g. in an automatic
DNA synthesizer, purified, annealed, ligated and cloned in suitable vectors.

The DNA sequence is then inserted into a recombinant expression vector, which
may be any vector that can conveniently be subjected to recombinant DNA
procedures. The choice of vector will often depend on the host cell into which it
is to be introduced. Thus, the vector may be an autonomously replicating vector,
i.e. a vector that exists as an extrachromosomal entity, the replication of which
is independent of chromosomal replication, e.g. a plasmid. Alternatively, the
vector may be one which, when introduced into a host cell, is integrated into the
host cell genome and replicated together with the chromosome(s) into which it has
been integrated.

In the vector, the DNA sequence encoding a fusion protein should be operably
connected to a suitable promoter sequence. The promoter may be any DNA sequence
which shows transcriptional activity in the host cell of choice and may be derived
from genes encoding proteins either homologous or heterologous to the host cell.
Examples of suitable promoters for directing the transcription of the coding DNA
sequence in mammalian cells are the SV 40 promoter (Subramani et al., 1981, Mol.
Cell Biol. 1:854-864), the MT-1 (metallothionein gene) promoter (Palmiter et al.,
1983, Science 222:809-814) or the adenovirus 2 major late promoter. A suitable
promoter for use in insect cells is the polyhedrin promoter (Vasuvedan et al.,
1992, FEBS Lett. 311:7-11). Suitable promoters for use in yeast host cells include
promoters from yeast glycolytic genes (Hitzeman et al., 1980, J. Biol. Chem.
255:12073-12080; Alber and Kawasaki, 1982, J. Mol. Appl. Gen. 1:419-434) or
alcohol dehydrogenase genes (Young et al., 1982, in Genetic Engineering of
Microorganisms for Chemicals, Hollaender et al., eds., Plenum Press, New York), or
the TPI1 (US 4,599,311) or ADH2-4c (Russell et al., 1983, Nature 304:652-654)
promoters. Suitable promoters for use in filamentous fungus host cells are, for
instance, the ADH3 promoter (McKnight et al., 1985, EMBO J. 4:2093-2099) or the
tpiA promoter.

The coding DNA sequence may also be operably connected to a suitable terminator,
such as the human growth hormone terminator (Palmiter et al., op. cit.) or (for
fungal hosts) the TPI1 (Alber and Kawasaki, op. cit.) or ADH3 (McKnight et al.,
op. cit.) terminators. The vector may further comprise elements such as
polyadenylation signals (e.g. from SV 40 or the adenovirus 5 E1b region),
transcriptional enhancer sequences (e.g. the SV 40 enhancer) and translational
enhancer sequences (e.g. the ones encoding adenovirus VA RNAs).

The recombinant expression vector may further comprise a DNA sequence enabling
the vector to replicate in the host cell in question. An example of such a sequence
(when the host cell is a mammalian cell) is the SV 40 origin of replication. The
vector may also comprise a selectable marker, e.g. a gene the product of which
complements a defect in the host cell, such as the gene coding for dihydrofolate
reductase (DHFR), or one which confers resistance to a drug, e.g. neomycin,
hygromycin or methotrexate.

The procedures used to ligate the DNA sequences coding for the fusion proteins, the
promoter and the terminator, respectively, and to insert them into suitable vectors
containing the information necessary for replication, are well known to persons
skilled in the art (cf., for instance, Sambrook et al., op. cit.).

The synthesis of the recombinant fusion protein may be by use of in vitro or in vivo
cultures. The host cell culture is preferably a eukaryotic host cell culture. By
transformation of a eukaryotic cell culture is in this context meant introduction
of recombinant DNA into the cells. The expression construct used in the process is
characterised by having the encoding region selected from mammalian genes,
including human genes and genes closely resembling them, such as the genes from
the chimpanzee. The expression construct used is furthermore characterised by the
promoter region being selected from genes of viruses or eukaryotes, including
mammalian cells and cells from insects.

The process for producing recombinant MBL according to the invention is
characterised in that the host cell culture is preferably eukaryotic, and for
example a mammalian cell culture. A preferred host cell culture is a culture of
human kidney cells, and in an even more preferred form the host cell culture is a
culture of human embryonal kidney cells (HEK cells), such as HEK 293 cell lines for
production of recombinant human MBL. By "HEK 293 cell lines" is meant any cell line
derived from human embryonal kidney tissue such as, but not limited to, the cell
lines deposited at the American Type Culture Collection with the numbers CRL-1573
and CRL-10852.

Other cells may be chick embryo fibroblast, hamster ovary cells, baby hamster
kidney cells, human cervical carcinoma cells, human melanoma cells, human kidney
cells, human umbilical vascular endothelium cells, human brain endothelium cells,
human oral cavity tumor cells, monkey kidney cells, mouse fibroblast, mouse kidney
cells, mouse connective tissue cells, mouse oligodendritic cells, mouse
macrophage, mouse fibroblast, mouse neuroblastoma cells, mouse pre-B cell, mouse B
lymphoma cells, mouse plasmacytoma cells, mouse teratocarcinoma cells, rat
astrocytoma cells, rat mammary epithelium cells, COS, CHO, BHK, VERO, HeLa, MDCK,
WI38, and NIH 3T3 cells.

Alternatively, fungal cells (including yeast cells) may be used as host cells.
Examples of suitable yeast cells include cells of Saccharomyces spp. or
Schizosaccharomyces spp., in particular strains of Saccharomyces cerevisiae.
Examples of other fungal cells are cells of filamentous fungi, e.g. Aspergillus
spp. or Neurospora spp., in particular strains of Aspergillus oryzae or Aspergillus
niger. The use of Aspergillus spp. for the expression of proteins is described in,
e.g., EP 238 023.

In addition, a host cell strain may be chosen which modulates the expression of the
inserted sequences, or modifies and processes the gene product in the specific
fashion desired. Such modifications (for example, glycosylation) and processing (for
example, cleavage) of protein products may be important for the function of the
protein. Different host cells have characteristic and specific mechanisms for the
post-translational processing and modification of proteins and gene products.
Appropriate cell lines or host systems can be chosen to ensure the correct
modification and processing of the foreign protein expressed. To this end,
eukaryotic host cells which possess the cellular machinery for proper processing of
the primary transcript, glycosylation, and phosphorylation of the gene product may
be used. The mammalian cell types listed above are among those that could serve as
suitable host cells.

Methods of transfecting mammalian cells and expressing DNA sequences introduced
in the cells are described in e.g. Kaufman and Sharp, J. Mol. Biol. 159, 1982,
pp. 601-621; Southern and Berg, 1982, J. Mol. Appl. Genet. 1:327-341; Loyter et al.,
1982, Proc. Natl. Acad. Sci. USA 79:422-426; Wigler et al., 1978, Cell 14:725;
Corsaro and Pearson, 1981, in Somatic Cell Genetics 7, p. 603; Graham and van der
Eb, 1973, Virol. 52:456; and Neumann et al., 1982, EMBO J. 1:841-845.

Other eukaryotic production systems are also envisaged by the present invention,
such as the production of the fusion protein in a transgenic plant or animal.

In another aspect the present invention provides a method for producing a fusion
protein by

- preparing a gene expression construct as defined above encoding a fusion protein,

- transforming a host cell culture with the construct,

- cultivating the host cell culture, thereby obtaining expression and secretion of the
polypeptide into the culture medium, followed by

- obtaining a culture medium comprising recombinant fusion protein, and

- purifying the fusion protein.

The medium used to culture the cells may be any conventional medium suitable for
growing mammalian cells, such as a serum-containing or serum-free medium
containing appropriate supplements, or a suitable medium for growing insect, yeast
or fungal cells. Suitable media are available from commercial suppliers or may be
prepared according to published recipes (e.g. in catalogues of the American Type
Culture Collection). Examples of culture media are RPMI-1640 or DMEM supplemented
with, e.g., insulin, transferrin, selenium, and foetal bovine serum.

The fusion proteins recombinantly produced by the cells may then be recovered
from the culture medium by conventional procedures, including separating the host
cells from the medium by centrifugation or filtration, precipitating the
proteinaceous components of the supernatant or filtrate by means of a salt, e.g.
ammonium sulphate, and purification by a variety of chromatographic procedures,
e.g. HPLC, ion exchange chromatography, affinity chromatography, or the like.

In a preferred embodiment the fusion protein is purified by use of carbohydrate
affinity chromatography as described above. In a preferred embodiment of the
invention the affinity chromatography is performed by means of mannose-, hexose-
or N-acetyl-glucosamine-derivatized matrices, which are suitable for affinity
chromatography. In particular, an affinity chromatography is used in which the
matrices have been derivatized with mannose.

Purified recombinant fusion protein is in this context to be understood as
recombinant fusion protein purified from cell culture supernatants, or from body
fluids or tissue of transgenic animals, by use of for example carbohydrate affinity
chromatography.

After application of the culture media the column is washed, preferably by using
non-denaturing buffers having a composition, pH and ionic strength resulting in
elimination of proteins without eluting the fusion protein. Such a buffer may be TBS.
Elution of fusion protein is performed with a selective desorbing agent capable of
efficient elution of fusion protein, such as TBS containing a desorbing agent, such
as EDTA (5 mM for example) or mannose (50 mM for example), and fusion proteins
are collected.
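
For orientation, the amount of desorbing agent needed for a given buffer volume
follows directly from the molar concentration. The Python sketch below is an
illustration only: the function name is hypothetical, the molar mass of mannose
(180.16 g/mol) is a standard value not taken from the text, and the molar mass
of EDTA depends on the salt form used and must therefore be supplied by the
user.

```python
MANNOSE_G_PER_MOL = 180.16  # standard molar mass of mannose (C6H12O6)


def grams_needed(conc_mM: float, molar_mass_g_per_mol: float, volume_l: float) -> float:
    """Mass in grams giving `conc_mM` millimolar in `volume_l` litres."""
    if conc_mM < 0 or molar_mass_g_per_mol <= 0 or volume_l < 0:
        raise ValueError("concentration, molar mass and volume must be non-negative")
    return conc_mM / 1000.0 * molar_mass_g_per_mol * volume_l


# 50 mM mannose in one litre of elution buffer.
mannose_grams = grams_needed(50.0, MANNOSE_G_PER_MOL, 1.0)
print(round(mannose_grams, 3))
```

The same function can be used for the 5 mM EDTA example once the molar mass of
the chosen EDTA salt is known.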






Pharmaceutical composition and treatment 



The fusion protein obtained by the present invention may be used for the
preparation of a pharmaceutical composition for the prevention and/or treatment of
various diseases or conditions. In the present context the term pharmaceutical
composition is used synonymously with the wording medicament.

In addition to the fusion protein, the pharmaceutical composition may comprise a
pharmaceutically acceptable carrier substance and/or vehicles.

In particular, a stabilising agent may be added to stabilise the fusion proteins. The
stabilising agent may be a sugar alcohol, saccharide, protein and/or amino acids. An
example of a stabilising agent may be albumin or maltose.

Other conventional additives may be added to the pharmaceutical composition de- 
pending on administration form for example. 

In one embodiment the pharmaceutical composition is in a form suitable for
injections. Conventional carrier substances, such as isotonic saline, may be used.

In another embodiment the pharmaceutical composition is in a form suitable for
pulmonary administration, such as in the form of a powder for inhalation, or a
cream or fluid for topical application.

A treatment in this context may comprise cure and/or prophylaxis of e.g. the immune
system and reproductive system in humans and in animals having said functional
units acting in this respect like those in humans. By conditions to be treated are not
necessarily meant conditions presently known to be in need of treatment, but
generally any condition in connection with current and/or expected need or
in connection with an improvement of a normal condition. In particular, the treatment
is a treatment of a condition of deficiency of lectins, such as MBL deficiency.

In another aspect of the present invention the manufacture is provided of a
medicament consisting of said pharmaceutical compositions of fusion protein
intended for treatment of conditions comprising cure and/or prophylaxis of
conditions of diseases and disorders of e.g. the immune system and reproductive
system in humans and in animals having said functional units acting like those in
humans.

Said diseases, disorders and/or conditions in need of treatment with the compounds
of the invention comprise e.g. treatment of conditions of deficiency of MBL,
treatment of cancer and of infections in connection with immunosuppressive
chemotherapy, including in particular those infections which are seen in connection
with conditions during cancer treatment or in connection with implantation and/or
transplantation of organs. The invention also comprises treatment of conditions in
connection with recurrent miscarriage.

Thus, in particular the pharmaceutical composition may be used for the treatment
and/or prevention of clinical conditions selected from infections, MBL deficiency,
cancer, disorders associated with chemotherapy, such as infections, diseases
associated with human immunodeficiency virus (HIV), diseases related with
congenital or acquired immunodeficiency. More particularly, chronic inflammatory
demyelinating polyneuropathy (CIDP), Multifocal motor neuropathy, Multiple
sclerosis, Myasthenia Gravis, Eaton-Lambert's syndrome, Optic neuritis, Epilepsy;
Primary antiphospholipid syndrome; Rheumatoid arthritis, Systemic Lupus
erythematosus, Systemic scleroderma, Vasculitis, Wegener's granulomatosis,
Sjögren's syndrome, Juvenile rheumatoid arthritis; Autoimmune neutropenia,
Autoimmune haemolytic anaemia, Neutropenia; Crohn's disease, Ulcerative colitis,
Coeliac disease; Asthma, Septic shock syndrome, Chronic fatigue syndrome,
Psoriasis, Toxic shock syndrome, Diabetes, Sinusitis, Dilated cardiomyopathy,
Endocarditis, Atherosclerosis, Primary hypo/agammaglobulinaemia including common
variable immunodeficiency, Wiskott-Aldrich syndrome and severe combined
immunodeficiency (SCID), Secondary hypo/agammaglobulinaemia in patients with
chronic lymphatic leukaemia (CLL) and multiple myeloma, Acute and chronic
idiopathic thrombocytopenic purpura (ITP), Allogenic bone marrow transplantation
(BMT), Kawasaki's disease, and Guillain-Barré's syndrome.

The route of administration may be any suitable route, such as intravenously,
intramuscularly, subcutaneously or intradermally. Also, pulmonary or topical
administration is envisaged by the present invention.

In particular the fusion protein may be administered to prevent and/or treat infections
in patients having clinical symptoms associated with congenital or acquired MBL
deficiency or being at risk of developing such symptoms. A wide variety of
conditions may lead to increased susceptibility to infections in MBL-deficient
individuals, such as chemotherapy or other therapeutic cell toxic treatments,
cancer, AIDS, genetic disposition, chronic infections, and neutropenia.

The pharmaceutical composition may thus be administered for a period before the
onset of administration of chemotherapy or the like and during at least a part of the
chemotherapy.

The fusion protein may be administered as a general "booster" before chemotherapy,
or it may be administered only to those being at risk of developing MBL
deficiency. The group of patients being at risk may be determined by measuring the
MBL level before treatment and only subjecting those to treatment whose MBL level
is below a predetermined level.
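
The selection rule in the preceding paragraph is a simple threshold filter. The
Python sketch below illustrates it with hypothetical patient identifiers and
levels; the function name is an assumption, and the threshold is a required
argument because the text gives no value or unit for the predetermined level.

```python
def select_for_treatment(mbl_levels: dict, predetermined_level: float) -> list:
    """Return identifiers of patients whose measured MBL level is below the cut-off."""
    return sorted(
        patient for patient, level in mbl_levels.items()
        if level < predetermined_level
    )


# Hypothetical measurements in arbitrary units; not data from the text.
measured = {"patient-1": 0.2, "patient-2": 1.5, "patient-3": 0.05}
print(select_for_treatment(measured, predetermined_level=0.5))
```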

The fusion protein is administered in suitable dosage regimes; in particular it is
usually administered at suitable intervals, e.g. once or twice a week during
chemotherapy.

Normally from 1-100 mg is administered per dosage, such as from 2-10 mg, mostly
from 5-10 mg per dosage. Mostly about 0.1 mg/kg body weight is administered.
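
The two figures above, about 0.1 mg/kg body weight and 1-100 mg per dosage, can
be combined in a small worked calculation. The Python sketch below is purely
illustrative arithmetic: the function name is hypothetical, and clamping the
weight-based amount to the 1-100 mg range is an assumption of this sketch, not
something the text prescribes.

```python
def dose_mg(body_weight_kg: float, mg_per_kg: float = 0.1,
            min_mg: float = 1.0, max_mg: float = 100.0) -> float:
    """Weight-based amount in mg, kept within the stated per-dosage range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    weight_based = body_weight_kg * mg_per_kg
    # Clamping to [min_mg, max_mg] is an assumption made for this sketch.
    return min(max(weight_based, min_mg), max_mg)


print(round(dose_mg(70.0), 2))  # weight-based amount for a 70 kg body weight
```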

15 Furthermore, an aspect of the present invention is the use of a recombinant compo- 
sition according to the present invention in a kit-of-parts further comprising another 
medicament. In particular the other medicament may be an anti-microbial medica- 
ment, such as antibiotics. 

Concerning miscarriage, it has been reported that the frequency of low plasma levels
of MBL is increased in patients with otherwise unexplained recurrent miscarriages.
This is the rationale for lowering the susceptibility to miscarriage by reconstituting
the MBL level through administration of recombinant MBL in these cases.

As to the nature of the compounds of the invention, it appears that, in its broad aspect,
the present invention relates to compounds which are able to act as opsonins, that
is, able to enhance uptake by macrophages either through direct interaction between
the compound and the macrophage or through mediating complement deposition
on the target surface.

Examples 

Example 1 

Plasmid cloning of FCNMBL-r1, -r2, -r3, -r4, -r5, -r6 and -r7.



1.1 Summary 

A series of plasmids was constructed for the expression in mammalian cells of
protein fusions between recombinant human mannose-binding lectin 2
(rhMBL) and human ficolin 2 (FCN2). The vector is derived from a high-copy-number
ColE1-based plasmid and is designed to allow protein expression in mammalian
systems. Expression of the fusion proteins is driven by the human cytomegalovirus
(CMV) immediate early promoter to promote constitutive expression.
Selection in bacteria is made possible by the ampicillin-resistance gene under control
of the prokaryotic β-lactamase promoter. The neomycin-resistance gene is
driven by the SV40 early promoter, which provides stable selection with G418 in
mammalian cells.

1.2 Constructs and experimental work

In order to express fusion proteins between ficolin 2 and MBL we designed
and constructed a series of plasmids. The new recombinant plasmids are based on
the previously cloned pcDNA2001-cintMBLcDNA. This plasmid contains a synthetic
intron together with the cDNA for human MBL.

The following fusions were designed (underlined font indicates the FCN2 part; italics
indicate the MBL part of the fusion protein).

FCN2MBLr1 (SEQ ID NO: 118):

FCN2 (signalseq+ collagen+"hinge" to ficolin dom aa131) MBL (from aa129 carbohyd.bind dom.)

MELDRAVGVLGAATLLLSFLGMAWALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGKRGERG
PPGPPGKAGPPGPNGAPGEPQPCLTGPRTCKDLLDRGHFLSGWHTIYLPDCRPLTFSLGKQVGNKFFLTNGEIMT
FEKVKALCVKFQASVATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDC
VLLLKNGQWNDVPCSTSHLAVCEFPI

FCN2MBLr2 (SEQ ID NO: 119):

FCN2 (signalseq+ collagen+"hinge"+part of ficolin-dom. containing pred. coil-coil to
aa207) MBL (from aa129 carbohyd.bind dom.)

MELDRAVGVLGAATLLLSFLGMAWALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGKRGERG
PPGPPGKAGPPGPNGAPGEPQPCLTGPRTCKDLLDRGHFLSGWHTIYLPDCRPLTVLCDMDTDGGGWTVFQRRVD
GSVDFYRDWATYKQGFGSRLGEFWLGNDNIHALTAQGTSELRVDLVDFEDNYQFAKLTFSLGKQVGNKFFLTNGE
IMTFEKVKALCVKFQASVATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSD
EDCVLLLKNGQWNDVPCSTSHLAVCEFPI



FCN2MBLr3 (SEQ ID NO: 120):

FCN2 (signalseq+ collagen to aa92) MBL (from aa101 coil-coil + carbohyd.bind dom.)

MELDRAVGVLGAATLLLSFLGMAWALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGKRGERG
PPGPPGKAGPPGPNGAPPDGDSSLAASERKALQTEMARIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKF
QASVATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNGQWND
VPCSTSHLAVCEFPI



FCN2MBLr4 (SEQ ID NO: 121):

FCN2 (signalseq+ part of collagen to cons.K at aa93) MBL (from cons.K at aa77
rest of collagen+coil-coil + carbohyd.bind dom.)

MELDRAVGVLGAATLLLSFLGMAWALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGKRGERG
PPGPPGKLGPPGNPGPSGSPGPKGQKGDPGKSPDGDSSLAASERKALQTEMARIKKWLTFSLGKQVGNKFFLTNG
EIMTFEKVKALCVKFQASVATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGS
DEDCVLLLKNGQWNDVPCSTSHLAVCEFPI

FCN2MBLr5 (SEQ ID NO: 122):

FCN2 (signalseq+ part of collagen to cons.G at aa69) MBL (from cons.G at aa.64
rest of collagen (containing "kick")+coil-coil + carbohyd.bind dom.)

MELDRAVGVLGAATLLLSFLGMAWALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGQGLRGL
QGPPGKLGPPGNPGPSGSPGPKGQKGDPGKSPDGDSSLAASERKALQTEMARIKKWLTFSLGKQVGNKFFLTNGE
IMTFEKVKALCVKFQASVATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSD
EDCVLLLKNGQWNDVPCSTSHLAVCEFPI



FCN2MBLr6 (SEQ ID NO: 123):

MBL (replaced MBL collagen (aa.41 to aa 99)+coil-coil + carbohyd.bind dom.) FCN2
(inserted collagen aa.54 to aa.92)

MSLFPSLPLLLLSMVAASYSETVTCEDAQKTCPAVIACSSPGCPGLPGAPGDKGEAGTNGKRGERGPPGPPGKAG
PPGPNGAPSPDGDSSLAASERKALQTEMARIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKFQASVATPR
NAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNGQWNDVPCSTSHL
AVCEFPI



FCN2MBLr7 (SEQ ID NO: 124):

MBL (signal seq. to aa.25) FCN2 (collagen to aa93) MBL (from aa100 coil-coil +
carbohyd.bind dom.)

MSLFPSLPLLLLSMVAASYSALQAADTCPEVKMVGLEGSDKLTILRGCPGLPGAPGDKGEAGTNGKRGERGPPGP
PGKAGPPGPNGAPSPDGDSSLAASERKALQTEMARIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKFQAS
VATPRNAAENGAIQNLIKEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNGQWNDVPC
STSHLAVCEFPI
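The construct descriptions above all follow the same pattern: a stretch of one protein up to a given residue is joined to the other protein from a given residue onwards. A minimal sketch of that bookkeeping is given below. The helper names and the toy sequences are hypothetical, and 1-based inclusive residue numbering is an assumption, since the numbering convention behind "aa131" or "aa129" is not stated here.

```python
def residues(seq, start, end=None):
    """Return residues start..end (1-based, inclusive); end=None means to the C-terminus."""
    return seq[start - 1:end]


def fuse(first, first_end, second, second_start):
    """Join residues 1..first_end of `first` to residues second_start.. of `second`."""
    return residues(first, 1, first_end) + residues(second, second_start)


# Toy placeholder sequences, NOT the real FCN2 or MBL sequences.
toy_fcn2 = "AAAAACCCCC"
toy_mbl = "DDDDDEEEEE"

chimera = fuse(toy_fcn2, 5, toy_mbl, 6)
print(chimera)
```

Under the stated numbering assumption, a construct described as "FCN2 to aa131, MBL from aa129" would correspond to a call of the form `fuse(fcn2, 131, mbl, 129)` with the full precursor sequences.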






WO 2004/024925    PCT/DK2003/000585



Parental plasmids used for all constructions:
• pcDNA2001-cintMBLcDNA
• Invitrogen Genestorm clone RG000632 (Cat. No. H-K1000, Invitrogen).

Constructions were done by recombination using the BD In-Fusion™ PCR Cloning
Kit from BD (Cat. No. 631774). The BD In-Fusion Kit allows the cloning of PCR
products based only on 2 x 15 bp homology between the vector and the ends of the PCR
product. Neither ligase nor phosphatase is needed when cloning with this kit. The
In-Fusion enzyme captures the DNA fragment ends and fuses the insert to the vector.

Primers used for the PCR reactions are shown in table 1.
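The 15 bp homology requirement can be illustrated with a short sketch that assembles a primer pair from a vector end and an insert end. The function names and all four sequences below are hypothetical placeholders and are not the primers of table 1; the sketch only shows how each primer carries 15 bases of vector sequence at its 5' end.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_primer(vector_upstream, insert_start, homology=15):
    """Last `homology` bases of the vector before the junction, then insert-specific bases."""
    return vector_upstream[-homology:] + insert_start


def reverse_primer(insert_end, vector_downstream, homology=15):
    """Reverse complement of the insert end followed by the first `homology` vector bases."""
    return reverse_complement(insert_end + vector_downstream[:homology])


# Hypothetical sequences for illustration only.
vector_upstream = "ttgacgtcaatgggagtttgttttg"
vector_downstream = "ccatggtacctgcaggaattcaagc"
insert_start = "atggctagcaaaggagaag"
insert_end = "ggatgaactatacaaataa"

fwd = forward_primer(vector_upstream, insert_start)
rev = reverse_primer(insert_end, vector_downstream)
print(fwd)
print(rev)
```

The resulting PCR product then has 15 bp at each end that match the ends of the linearized vector, which is the 2 x 15 bp homology referred to above.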

PCR reactions and linearization of vector for recombination

PCR reactions were done on plasmid "Genestorm RG000632" batch N135r15C digested
with BstZ17I (N135-20B). Primer pairs were used as described below. Kit
for PCR reactions: PfuUltra™ Hotstart PCR Master Mix (Stratagene #600630). The
PCR reaction tubes were run on the Bio-Rad iCycler using the temperature profile
shown in table 2.
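As a rough guide to the programmed run time, the sketch below adds up the hold times of the repeated cycle and the final extension listed in table 2. It deliberately leaves out the initial denaturation step and the instrument's ramping time, so the real run is longer; the constant and function names are illustrative.

```python
# (temperature in °C, duration in seconds) for one amplification cycle, per table 2.
# The second step is listed as 72° (67.5° for r2).
CYCLE_STEPS = [(95.0, 30), (72.0, 30), (72.0, 60)]
N_CYCLES = 30
FINAL_EXTENSION_S = 10 * 60


def cycling_time_s(steps=CYCLE_STEPS, n_cycles=N_CYCLES, final_extension_s=FINAL_EXTENSION_S):
    """Hold time of the repeated cycles plus the final extension, ignoring ramping."""
    return n_cycles * sum(duration for _, duration in steps) + final_extension_s


print(cycling_time_s() / 60, "min")
```

With 30 cycles of 2 minutes each plus the 10 minute final extension, the programmed hold time comes to 70 minutes.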

For the recombination reactions the vector pcDNA2001-cintMBLcDNA was linearized
by restriction enzyme digestion with the enzymes listed below.

FCN2MBLr1:
PCR using primers: Pr1-xho-MBLFCN + Pr4-Xmn-FCNMBL-rev (product 463 bp)
Digest of pcDNA2001-cintMBLcDNA: XhoI + XmnI (partial)

FCN2MBLr2:
PCR using primers: Pr1-xho-MBLFCN + Pr5-Xmn-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: XhoI + XmnI (partial)

FCN2MBLr3:
PCR using primers: Pr1-xho-MBLFCN + Pr6-b-Bsp-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: XhoI + BspEI

FCN2MBLr4:
PCR using primers: Pr1-xho-MBLFCN + Pr2-apa-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: XhoI + ApaI

FCN2MBLr5:
PCR using primers: Pr1-xho-MBLFCN + Pr3-apa-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: XhoI + ApaI

FCN2MBLr6:
PCR using primers: Pr8-BstAP-MBLFCN + Pr6-b-Bsp-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: partial BstAPI + BspEI

FCN2MBLr7:
PCR using primers: Pr7-Alw-MBLFCN + Pr6-b-Bsp-FCNMBL-rev
Digest of pcDNA2001-cintMBLcDNA: partial AlwNI + BspEI

In-Fusion PCR recombination reactions

In-Fusion PCR recombination reactions were set up using approx. 50-100 ng of
Qiagen MinElute purified PCR products together with 50-100 ng of Qiagen
MinElute purified linearized pcDNA2001-cintMBLcDNA.
1/10 of each recombination reaction was transformed into MAX Efficiency DH5α
Competent Cells (Invitrogen Cat. No. 18258-012). 1/10 and 9/10 of each transformation
were spread on separate LB plates containing 200 µg/ml ampicillin.
Plates were incubated at 37°C overnight.

Screening for positive clones: At least 6 colonies from each experimental plate were
picked for miniprep plasmid DNA isolation. To determine the presence of insert,
DNA was analyzed by restriction digest analysis with the enzyme PstI. Three individual
positive clones from each reaction were chosen for further work.

Restriction Analysis

In order to verify the selected individual recombinant plasmids after the primary
screen we performed an extensive restriction enzyme digestion analysis on the
plasmid DNA isolated.

Plasmid DNA of the recombinants was digested with the enzymes shown in table 3.
The expected fragments are also listed in the table. All recombinant clones tested
exhibited the expected pattern, except that digestion with EcoRI was not as predicted:
an additional fragment was observed in digests of both the recombinants and the
parental plasmid. The discrepancy can be explained by an additional EcoRI site on
the parental plasmid.
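A quick consistency check on expected digest patterns is that, for a complete digest of a circular plasmid, the fragment lengths add up to the plasmid length whichever enzyme is used. The sketch below applies this to the expected fragments of the parental plasmid pcDNA2001-cintMBLcDNA as listed in table 3; the variable and function names are illustrative.

```python
# Expected fragments (bp) for pcDNA2001-cintMBLcDNA, taken from table 3.
EXPECTED_FRAGMENTS_BP = {
    "PstI": [4212, 1586, 405, 375],
    "EcoRI": [5810, 768],
    "XmaI": [6578],
    "BstAPI": [4622, 1469, 415, 72],
    "NcoI": [3435, 2408, 735],
}


def plasmid_sizes(fragments_by_enzyme):
    """Sum the fragment lengths per enzyme; each sum should equal the plasmid size."""
    return {enzyme: sum(frags) for enzyme, frags in fragments_by_enzyme.items()}


sizes = plasmid_sizes(EXPECTED_FRAGMENTS_BP)
print(sizes)
print("consistent:", len(set(sizes.values())) == 1)
```

All five enzymes give the same total of 6578 bp, so the expected patterns for the parental plasmid are mutually consistent. The same check can be applied column by column to the recombinant plasmids.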

Results

Recombinant plasmids obtained are shown schematically in figures 4-8 for constructs
-r1, -r2, -r3, -r4 and -r5.

Example 2

Experiments with transient expression of recombinant fusion proteins of human
MBL and human FCN2

2.1 Summary

We report the expression of recombinant human fusion proteins FCNMBLr1,
FCNMBLr4, FCNMBLr5 and MBL in HEK293 and Per.C6 cells. We found that the
cell lines in the transient transfection experiment were able to produce at least the
fusion proteins FCNMBLr4 and FCNMBLr5 assembled in active oligomers with a
structure primarily similar to MBL oligomer forms 3 and 4. The fusion proteins
FCNMBLr4 and FCNMBLr5 behaved like MBL upon binding to a carbohydrate surface
and upon activating the complement cascade.

2.2 Introduction

The aim of the studies was to elucidate the possibility of creating a hybrid protein
consisting of the collagen part of human ficolin 2 and the human mannose-binding
lectin (MBL). Furthermore we wished to clarify whether such molecules would still possess
the ability to bind to complex carbohydrate structures and still be able to activate
complement.

Two eukaryotic cell lines of human origin, HEK293 and Per.C6, were used as host
cell lines for transient transfections with the respective expression plasmids.
Transcription was driven by the CMV-IE promoter enhancer.

2.3 Experimental

Material and Methods

Plasmids used for the transfection experiments

pME607-FCNMBL-r1, -r2, -r3, -r4, -r5, -r6 and -r7 (described in example 1)

Origin of cells used

Per.C6 cells were obtained from Crucell.

HEK 293 Freestyle cells were obtained from Invitrogen.

Culture media

Per.C6 cells were cultured at 37°C in 10% (vol/vol) CO2, maintained as monolayers in
serum free medium.

HEK 293 Freestyle cells were cultured at 37°C in 8% (vol/vol) CO2, maintained as
suspension in an Invitrogen Freestyle medium.

Transfections and harvest of media

Per.C6 cells in serum containing medium were transfected with the DNA using the
transfection reagent Lipofectamine. One day after transfection the medium was
replaced with serum free medium.

HEK293 cells in serum free Freestyle medium were transfected with the DNA using
the transfection reagent 293fect. The medium was collected after approximately 4
days of incubation after transfection.

Quantification of MBL

MBL was quantified by time-resolved immunofluorometry (recombinant MBL assay,
TRIFMA) using mannan coated plates or mAb-131-01 coated plates.

SDS-PAGE and Western blot analysis

SDS-PAGE with subsequent electrophoretic transfer of proteins to polyvinylidene
difluoride membranes and detection of MBL using monoclonal anti-MBL antibody
was carried out.

C4 assay

The assay is designed to measure the ability of MBL and rMBL to initiate the MBL
lectin pathway of the complement system. MBL-associated serine protease 2 (MASP-2)
associated with MBL cleaves the complement factor C4, releasing C4a and C4b.
The C4b deposition on the mannan coated ELISA plates is detected with biotin
labelled antibodies against C4b and Europium labelled streptavidin.

2.4 Results

In the experiments described herein we were able to express FCN2MBLr4,
FCN2MBLr5 and MBL transiently in both HEK293 and Per.C6 cells under serum
free conditions.

Oligomeric form of the fusion proteins

The oligomeric forms of the fusion proteins were examined by non-reducing, denaturing
SDS-PAGE followed by a western blot. The detecting antibody recognizes the
CRD part of MBL (and maybe part of the coil-coil region). The results are shown in
figure 10. It is evident from the figure that the most prominent form of the fusion
proteins FCN2MBLr4 and FCN2MBLr5 is approximately 250 kDa, corresponding to
a 3- or 4-mer of subunits, each consisting of 3 single protein chains (24 kDa). The
appearance of the oligomeric form was independent of the host cells used. MBL was
produced in a wide range of oligomeric forms.
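The size estimate can be reproduced with simple arithmetic. The sketch below uses the chain mass of 24 kDa and the 3 chains per subunit stated in the text; it ignores glycosylation and any deviation in gel migration, so it is only a rough guide, and the names are illustrative.

```python
CHAIN_KDA = 24          # single protein chain, as stated in the text
CHAINS_PER_SUBUNIT = 3  # each subunit consists of 3 chains


def oligomer_mass_kda(n_subunits, chain_kda=CHAIN_KDA, chains=CHAINS_PER_SUBUNIT):
    """Nominal mass of an oligomer built from n_subunits three-chain subunits."""
    return n_subunits * chains * chain_kda


for n in (3, 4):
    print(f"{n}-mer: {oligomer_mass_kda(n)} kDa")
```

The nominal masses are 216 kDa for the 3-mer and 288 kDa for the 4-mer, which bracket the band observed at approximately 250 kDa.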

Binding properties

The fusion proteins were tested for functionality of the MBL carbohydrate binding
domain by binding to a mannan surface and detection with an antibody that recognizes
the CRD part of MBL (and maybe part of the coil-coil region). The results are
shown in table 4. It can be concluded that FCN2MBLr4 and FCN2MBLr5 were expressed
just as well as MBL in the host cells and that the fusion proteins bind to a
mannan surface.

MASP-2 binding and C4 cleavage

The fusion proteins were further tested for the capacity to bind MASP-2 and to activate
the serine protease of MASP-2. This was done by measuring cleavage of the
MASP-2 substrate complement factor C4 upon binding of the fusion protein to a
mannan surface. Results are shown in table 5. It can be concluded from these results
that the fusion proteins FCN2MBLr4 and FCN2MBLr5 preserved the ability to
bind and activate MASP-2.

Discussion

The results described herein demonstrate that it is possible to construct fusion
proteins of FCN2 and MBL with the following properties:
1. The oligomeric structure of the fusion proteins is simpler than that of the
MBL protein.
2. The fusion proteins keep the essential property of MBL, activation of the complement
cascade upon binding to a dense carbohydrate structure.

Table 1. Primers used for the PCR reactions

Sequence typed in bold shows the 15 bp homology needed for the recombination
into the vector.

Primer name: Pr1-xho-MBLFCN
DNA sequence of primer: ataggctagcctcgaagctcgcccttcaccatggagctggacag
Primer part of pcDNA2001-cintMBLcDNA: ataggctagcctcga
Primer part of [...]: agctcgcccttcaccatggagctggacag

Primer name: Pr2-apa-FCNMBL-rev
DNA sequence of primer: ccaactttccaggggggcccggggggccacgttctcctctctttcc
Primer part of pcDNA2001-cintMBLcDNA: g (replaced) ggcccccctggaaagttgg
Primer part of [...]: ggaaagagaggagaacgtggcccccc

Primer name: Pr3-apa-FCNMBL-rev
DNA sequence of primer: ccaactttccaggggggccctgtaagcctctgagcccttgtccattggtgcctgcctctcccttggg
Primer part of pcDNA2001-cintMBLcDNA: caagggctcagaggcttacagggcccccctggaaagttgg
Primer part of [...]: cccaagggagaggcaggcaccaatgga

Primer name: Pr4-Xmn-FCNMBL-rev
DNA sequence of primer: tggtcaggaagaacttgttcccaacttgtttgcccagagagaaagtcaggggccggcagtcgggcagg
Primer part of pcDNA2001-cintMBLcDNA: ttctctctgggcaaacaagttgggaacaagttcttcctgacca
Primer part of [...]: cctgcccgactgccggcccctgact

Primer name: Pr5-Xmn-FCNMBL-rev
DNA sequence of primer: tggtcaggaagaacttgttcccaacttgtttgcccagagagaacttagcaaactggtagttgtcctcaaagtcc
Primer part of pcDNA2001-cintMBLcDNA: ttctctctgggcaaacaagttgggaacaagttcttcctgacca
Primer part of [...]: ggactttgaggacaactaccagtttgctaag

Primer name: Pr6-b-Bsp-FCNMBL-rev
DNA sequence of primer: gactatcaccatccggaggtgctccgttgggcccaggtggtcc
Primer part of pcDNA2001-cintMBLcDNA: ccggatggtgatagt
Primer part of [...]: ggaccacctgggcccaacggagcacct

Primer name: Pr7-Alw-MBLFCN
DNA sequence of primer: cagcgtcttactcagctctccaggcggcagacacctgtcc
Primer part of pcDNA2001-cintMBLcDNA: cagcgtcttactcag
Primer part of [...]: ctctccaggcggcagacacctgtcc

Primer name: Pr8-BstAP-MBLFCN
DNA sequence of primer: agacctgccctgcagtgattgcctgtagctctccaggctgtccggggctgcctggggcccc
Primer part of pcDNA2001-cintMBLcDNA: agacctgccctgcagtgattgcctgtagctctcca
Primer part of [...]: ggctgtccggggctgcctggggcccc



Table 2:

Cycle   Times   Step   Temp                  Time
1       1x      1      95°                   2 min
2       30x     1      95°                   0 min 30 sec
                2      72° (67.5° for r2)    0 min 30 sec
                3      72°                   1 min
3       1x      1      72°                   10 min
4       1x      1      4°                    ∞



Table 3:

(fragment sizes in bp)

Enzyme    pcDNA2001-     pME607-     pME607-     pME607-     pME607-     pME607-
          cintMBLcDNA    FCN2MBLr1   FCN2MBLr2   FCN2MBLr3   FCN2MBLr4   FCN2MBLr5

PstI      4212           4212        4212        4212        4212        4212
          1586           1586        1586        1586        1586        1586
          405            788         1016        782         800         797
          375

EcoRI     5810           6586        6814        6580        6598        6595
          768

XmaI      6578           6586        6814        6580        6598        6595

BstXI     undigested     6586        6814        6580        6598        6595

BstAPI    4622           4622        4622        4622        4622        4622
          1469           1892        2120        1886        1904        1901
          415            72          72          72          72          72
          72

NcoI      3435           3435        3435        3435        3435        3435
          2408           1747        1975        1741        1759        1756
          735            735         735         735         735         735
                         669         669         669         669         669

Table 4: MBL binding to mannan measured by TRIFMA

[column headings and first rows illegible]

Construct     Cells     Measured value
FCN2MBLr4     HEK293    0,851
MBL           HEK293    1,885
FCN2MBLr5     Per.C6    0,296
MBL           Per.C6    0,271
FCN2MBLr4     Per.C6    0,077
FCN2MBLr4     Per.C6    0,091
FCN2MBLr4     Per.C6    0,089
FCN2MBLr4     Per.C6    0,035
MBL           Per.C6    0,092



Table 5: C4 activity of the fusion proteins

Plasmid                          Cells     Activity +/-
pME607-FCNMBLr5 clone 1          HEK293    +
pME607-FCNMBLr5 clone 5          HEK293    +
pME607-FCNMBLr4 clone 2          HEK293    +/+ (after purification)
pcDNA2001-cintMBLcDNA            HEK293    +
pME607-FCNMBLr5 clone 5          Per.C6    +
pME607-FCNMBLr4 clone 2 (maxi    Per.C6    -/(+) (after purification)

Claims

1. A fusion protein comprising
i) A first polypeptide sequence derived from a lectin-complement pathway
activating protein or a functional homologue thereof; and
ii) A second polypeptide sequence derived from a collectin or a functional
homologue thereof;
wherein said complement activating protein is not a collectin.

2. The fusion protein according to claim 1, wherein said first polypeptide sequence
is capable of activating the lectin-complement pathway.

3. The fusion protein according to claim 1, wherein said first polypeptide sequence
is capable of associating with at least one MASP protein.

4. The fusion protein according to claim 1, wherein said first polypeptide sequence
is capable of associating with a MASP protein selected from the group consisting
of MASP-1, MASP-2 and MASP-3 or functional homologues or variants
thereof.

5. The fusion protein according to claim 1, wherein the complement activating protein
is a ficolin.

6. The fusion protein according to claim 5, wherein the ficolin is selected from the
group consisting of L-ficolin, H-ficolin and M-ficolin.

7. The fusion protein according to claim 5, wherein the ficolin is L-ficolin.

8. The fusion protein according to any of claims 1 to 7, wherein said first polypeptide
sequence comprises at least 10, such as at least 12, for example at least
15, such as at least 20, for example at least 25, such as at least 30, for example
at least 35, such as at least 40, for example at least 50 consecutive amino acids
of a complement activating protein or a sequence at least 70%, such as 80%, for
example 90%, such as 95% identical thereto.

9. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the collagen-like domain of a ficolin or a functional homologue or
variant thereof.

10. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the collagen-like domain of L-ficolin.

11. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the cysteine-rich region of a ficolin or a functional homologue thereof.

12. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the cysteine-rich region of L-ficolin.

13. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the cysteine-rich region and the collagen-like domain of a ficolin or a
functional homologue or variant thereof.

14. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises the cysteine-rich region and the collagen-like domain of L-ficolin.

15. The fusion protein according to claim 1, wherein the first polypeptide sequence
comprises amino acids 1-77 of SEQ ID NO 1.

16. The fusion protein according to claim 1, wherein said second polypeptide sequence
is capable of associating with one or more carbohydrates.

17. The fusion protein according to claim 1, wherein the collectin is selected from
the group consisting of MBL (mannose-binding lectin), SP-A (lung surfactant
protein A), SP-D (lung surfactant protein D), BK (or BC, bovine conglutinin) and
CL-43 (collectin-43).

18. The fusion protein according to claim 17, wherein the collectin is MBL.

19. The fusion protein according to any of claims 1 to 18, wherein said second polypeptide
sequence comprises at least 10, such as at least 12, for example at
least 15, such as at least 20, for example at least 25, such as at least 30, for example
at least 35, such as at least 40, for example at least 50 consecutive
amino acids of a collectin or a sequence at least 70%, such as 80%, for example
90%, such as 95% identical thereto.

20. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the CRD domain of a collectin or a functional homologue or
variant thereof.

21. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the CRD domain of MBL.

22. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the neck region of MBL.

23. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the collagen-like domain of MBL.

24. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the neck region and the CRD domain of MBL.

25. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises the collagen-like domain, the neck region and the CRD domain
of MBL.

26. The fusion protein according to claim 1, wherein the second polypeptide sequence
comprises amino acids 80-228 of SEQ ID NO 2.

27. The fusion protein according to claim 1, wherein the fusion protein comprises
the cysteine-rich region and the collagen-like domain of L-ficolin and the CRD
domain of MBL.

28. The fusion protein according to claim 1, wherein the fusion protein comprises the
cysteine-rich region of L-ficolin and the collagen-like domain, the neck region
and the CRD domain of MBL.

29. The fusion protein according to claim 1, wherein the fusion protein comprises the
amino acid sequence as defined by SEQ ID NO 3, or a functional homologue
thereof.

30. The fusion protein according to claim 1, wherein the fusion protein consists of
the amino acid sequence as defined by SEQ ID NO 3.

31. An isolated nucleic acid comprising a nucleotide sequence encoding the fusion
protein according to any of claims 1 to 30.

32. A vector comprising the nucleic acid sequence according to claim 31.

33. A cell comprising the vector according to claim 32.

34. The cell according to claim 33, wherein the cell is a mammalian cell.

35. The cell according to claim 33, wherein the cell is a non-mammalian cell.

36. A fusion protein according to any of claims 1 to 30 for use as a medicament.

37. A method of treatment of a clinical condition in an individual in need thereof
comprising administering to said individual the fusion protein according to any of
claims 1 to 30.

38. The method according to claim 37, wherein the clinical condition is an infection.

39. The method according to claim 37, wherein the individual is a human being.

40. The method according to claim 37, wherein the individual is a human being suffering
from an increased risk of acquiring an infection.

41. The method according to claim 37, wherein the individual is a human being with
subnormal serum MBL level.

42. The method according to claim 37, wherein the individual is a human being with
normal serum MBL level.

43. Use of the fusion protein according to any of claims 1 to 30 for the preparation of
a medicament for the treatment of a clinical condition in an individual in need
thereof.

44. The use according to claim 43, wherein the clinical condition is an infection.

45. The use according to claim 43, wherein the individual is a human being.

46. The use according to claim 43, wherein the individual is a human being suffering
from an increased risk of acquiring an infection.

47. The use according to claim 43, wherein the individual is a human being with
subnormal serum MBL level.

48. The use according to claim 43, wherein the individual is a human being with
normal serum MBL level.

49. A medicament for the treatment or prevention of a clinical condition in an individual
in need thereof, comprising the fusion protein according to any of claims 1
to 30.

50. The medicament according to claim 49, wherein the clinical condition is an infection.

51. The medicament according to claim 49, wherein the individual is a human being.






10/527191

DT06 Rec'd PCT/PTO 10 MAR 2005



Abstract of the Disclosure 

The present invention relates to a fusion protein capable of activating the 
complement system, the fusion protein comprising a first polypeptide sequence derived from a 
lectin-complement pathway activating protein or a functional homologue thereof; and a second 
polypeptide sequence derived from a collectin or a functional homologue thereof; wherein said 
complement activating protein is not a collectin. A preferred fusion protein comprises amino 
acids of the L-ficolin sequence of figure 1 and amino acids of the MBL sequence shown in figure 
2. The fusion protein is suitable for use in treatment consisting of creation, reconstitution, 
enhancing and/or stimulating the opsonic and/or bactericidal activity of the complement system, 
i.e. enhancing the ability of the immune defence to recognise and kill microbial pathogens, and 
accordingly, the invention relates to a medicament comprising the fusion protein, methods for 
producing said fusion protein and methods for treating diseases, in particular infections. 

Figure 1: L-Ficolin

Annotations: Chain 1, Dom, Misc, Glycosylation 1, Glycosylation 2

FCN2 HUMAN@2
313 aa

GCPGLPGAPG DKGEAGTNGK RGERGPPGPP GKAGPPGPNG APGEPQPCLT
GPRTCKDLLD RGHFLSGWHT IYLPDCRPLT VLCDMDTDGG GWTVFQRRVD
GSVDFYRDWA TYKQGFGSRL GEFWLGNDNI HALTAQGTSE LRVDLVDFED
NYQFAKYRSF KVADEAEKYN LVLGAFVEGS AGDSLTFHNN QSFSTKDQDN
DLNTGNCAVM FQGAWWYKNC HVSNLNGRYL RGTHGSFANG INWKSGKGYN

Protein 'FCN2 HUMAN@2':

Ficolin 2 precursor (Collagen/fibrinogen domain-containing protein 2) (Ficolin-B) (Ficolin B) (Serum
lectin P35) (Hucolin) (L-Ficolin).

Figure 2: MBL

Annotations: Chain 1, Disulfide 1, Disulfide 2

MBL-C HUMAN

ETVTCEDAQK TCPAVIACSS PGINGFPGKD
GRDGTKGEKG EPGQGLRGLQ GPPGKLGPPG NPGPSGSPGP KGQKGDPGKS
PDGDSSLAAS ERKALQTEMA RIKKWLTFSL GKQVGNKFFL TNGEIMTFEK
VKALCVKFQA SVATPRNAAE NGAIQNLIKE EAFLGITDEK TEGQFVDLTG
NRLTYTNWNE GEPNNAGSDE DCVLLLKNGQ WNDVPCSTSH LAVCEFPI

MANNOSE-BINDING PROTEIN C PRECURSOR (MBP-C) (MBP1) (MANNAN-BINDING
PROTEIN) (MANNOSE-BINDING LECTIN).

Figure 3: Ficolin-MBL fusion

316 aa

LQAADTCPEV KMVGLEGSDK LTILRGCPGL PGAPGDKGEA GTNGKRGERG
PPGPPGKAGP PGPNGAPSPD GDSSLAASER KALQTEMARI KKWLTFSLGK
QVGNKFFLTN GEIMTFEKVK ALCVKFQASV ATPRNAAENG AIQNLIKEEA
FLGITDEKTE GQFVDLTGNR LTYTNWNEGE PNNAGSDEDC VLLLKNGQWN

Protein 'FCN2 HUMAN@2':

Ficolin 2 precursor (Collagen/fibrinogen domain-containing protein 2) (Ficolin-B) (Ficolin B) (Serum
lectin P35) (EBP-37) (Hucolin) (L-Ficolin).

Figure 4: [plasmid map; labels: Chimeric intron, ColE1 ori]

Figure 5: [plasmid map; label: Chimeric intron]

Figure 6: [plasmid map; label: Chimeric intron]

Figure 7: [plasmid map; label: Chimeric intron]

Figure 8: [plasmid map; label: Chimeric intron]

Figure 10: Western blot of FCN2MBLr4, FCN2MBLr5 and MBL